Hi Hemanth,

Thank you for taking the time to respond. I will take a look at ATLAS-51
and would also be interested in hearing from others, as you alluded to in
your response.

Cheers,

Sandeep.

On Sun, Dec 4, 2016 at 5:09 AM, Hemanth Yamijala <hyamij...@hortonworks.com>
wrote:

> Hi Sandeep,
>
> Responses inline. Hoping others can pitch in with more recent information,
> as mine might be a little dated.
>
> Thanks
> hemanth
> ________________________________________
> From: Sandeep Nayak <datacacoph...@gmail.com>
> Sent: Sunday, December 04, 2016 12:00 AM
> To: dev@atlas.incubator.apache.org
> Cc: Venkatesh Seetharam
> Subject: Re: Interest in Apache Atlas
>
> Hi all,
>
> Sending a reminder: I am still looking for answers to the questions below.
> Can someone help?
>
> Thanks in advance for your attention.
>
> - Sandeep
>
> On Thu, Dec 1, 2016 at 12:13 AM, Sandeep Nayak <datacacoph...@gmail.com>
> wrote:
>
> > Hi all,
> >
> > I had asked Venkatesh a couple of questions earlier; please see the email
> > below. He recommended that I move the questions to the dev mailing list,
> > hence this mail.
> >
> > To follow up on the questions asked below in response to my queries:
> >
> > (a) Multi-tenancy: If I bring in data-sets from different customers, I
> > need to record, annotate or tag them, and provide access to each data-set
> > only to the relevant owners. Is it possible to record and manage
> > data-sets for different customers in a single Atlas instance? Does Atlas
> > provide the necessary constructs to separate the recording of data-sets
> > and the tracking of metadata etc. by tenant?
>
> It is possible to build a solution on top of Atlas to satisfy your
> requirements. It appears you need a namespacing facility of sorts. While
> there is no native construct like that in Atlas today (please see ATLAS-51,
> which is still open), I guess you could rely on the extensibility of the
> type system to let your objects extend from a base type that defines a
> tenant attribute. Then use wrapper APIs that filter out objects according
> to the tenant in question. Of course, one could use the lower level APIs to
> get around this, and hence it is cooperative in nature.
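
The approach described above (a base type carrying a tenant attribute, plus wrapper APIs that filter by tenant) could be sketched roughly as follows. This is only an illustration in plain Python and does not call any Atlas API; TenantScopedEntity, DataSet, TenantCatalog and the tenant names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TenantScopedEntity:
    """Hypothetical base type: every catalog object carries a tenant attribute."""
    name: str
    tenant: str


@dataclass
class DataSet(TenantScopedEntity):
    """Hypothetical subtype that inherits the tenant attribute from the base type."""
    tags: List[str] = field(default_factory=list)


class TenantCatalog:
    """Hypothetical wrapper API; an in-memory list stands in for the metadata store."""

    def __init__(self) -> None:
        self._entities: List[TenantScopedEntity] = []

    def register(self, entity: TenantScopedEntity) -> None:
        self._entities.append(entity)

    def list_for(self, tenant: str) -> List[TenantScopedEntity]:
        # The wrapper filters out objects that belong to other tenants.
        return [e for e in self._entities if e.tenant == tenant]

    def all_entities_unfiltered(self) -> List[TenantScopedEntity]:
        # Lower-level access bypasses the filter, which is why the scheme
        # is cooperative rather than enforced.
        return list(self._entities)


catalog = TenantCatalog()
catalog.register(DataSet(name="orders", tenant="acme", tags=["pii"]))
catalog.register(DataSet(name="clicks", tenant="globex"))

acme_names = [e.name for e in catalog.list_for("acme")]
print(acme_names)
```

Separation here holds only as long as every caller goes through list_for; anything that reads the store directly sees all tenants.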
>
> >
> > (c) Performance Numbers: I understand it is built to scale given the use
> > of HBase but any performance numbers that can be shared will be helpful.
> > E.g. Is there a limit to the number of data-sets I can record on Atlas?
> > Are there performance numbers on the number of queries?
> >
>
> This is dated information (at least a couple of months old). If someone has
> updated numbers, we should hear from them. At that time, we tested
> importing 50K Hive tables and dependent objects (columns etc.) with a total
> of fewer than about 10M vertices.
>
> From what I remember, we could import these in about 20 minutes or so.
> However, this relied on some assumptions about the dependencies between the
> data sets, which is what let us bump up parallelism for the import. We
> tested reads with queries from 30 users in parallel. Times vary based on
> the type of query: simple lookups take seconds, but more complex queries
> like lineage take longer.
>
> Performance is a constant area of improvement in the project, and there are
> several JIRAs about performance changes, including some that are still
> open, e.g. ATLAS-711.
>
> > (d) Are there companies using Atlas in production at this stage?
> >
> > Thanks in advance for your responses.
> >
> > - Sandeep
> >
> >
> >
> >
> > On Fri, Nov 18, 2016 at 9:10 AM, Venkatesh Seetharam
> > <venkat...@apache.org> wrote:
> >
> >> Sandeep - please use the dev mailing list for Atlas for a prompt
> >> response.
> >>
> >> (a) How can one achieve multi-tenancy on Apache Atlas?
> >> Can you pls elaborate? You can always have a package structure for your
> >> data sets.
> >>
> >> (b) Is Atlas ready for production usage?
> >> It depends. I think it is, but it needs some scripting around BCP, etc.
> >>
> >> (c) Are there published numbers on the volume of data-sets Atlas can
> >> manage?
> >> It's built to scale; it uses Titan & HBase as a backend store, which is
> >> known to scale.
> >>
> >> On Fri, Nov 4, 2016 at 12:02 PM Sandeep Nayak <datacacoph...@gmail.com>
> >> wrote:
> >>
> >>> Hi Venkatesh,
> >>>
> >>> I apologize for the direct email; if there is a better channel to
> >>> surface my questions, I will be happy to go there. I am subscribed to
> >>> dev@atlas but thought that may not be the right forum for questions
> >>> potential Atlas users may have.
> >>>
> >>> I am looking for Data Catalog solutions and am in early evaluation. From
> >>> what I have read so far, it appears Apache Atlas provides most of the
> >>> capabilities I am looking for, namely data-set registration, lineage
> >>> tracking, access control (via Ranger) and auditing, to name a few.
> >>>
> >>> I do have a couple of questions that will help me in my evaluation:
> >>>
> >>> (a) How can one achieve multi-tenancy on Apache Atlas?
> >>> (b) Is Atlas ready for production usage?
> >>> (c) Are there published numbers on the volume of data-sets Atlas can
> >>> manage? One of the requirements I pointed out above is data lineage,
> >>> and if I am ingesting streaming and batch data sets, the typical volumes
> >>> could be very high.
> >>>
> >>> Hoping you will point me in the right direction to get answers.
> >>>
> >>> Thanks for your time and help.
> >>>
> >>> Regards,
> >>>
> >>> Sandeep
> >>>
> >>
> >
>
