To be clear, Connector is just a pairing of username/authentication with an instance. There are no connections or other resources involved. Sure, there's some memory needed to remember those bits of information, but it's just a few bytes (ok, like 1K).
Create a batch scanner, though, and there are threads, cached connections to the tablet servers, etc. Now you've started to use some precious resources. So, if you have 3M users, expect to load-balance their requests, not because of Connector objects, but because of the requests they will make.

The "superuser approach" is common. More precisely, it is a "queryuser" that has access to read some set of tables, but is limited using the appropriate authorizations for a specific real-life person. For example, you have a "doctor" user, who can read patient data. However, Dr. Smith needs the authorization "eric.newton" to read my information. When a doctor makes a request, the query infrastructure looks up their authorizations and applies them to the request. The doctor tables are available to any doctor, but they can only read my data if the system adds the authorizations for my data. But this is just management of authorizations, and not client resources.

This does decrease security a bit: if your mapping of authorizations (eric.newton) to a real-life person (Dr. Smith) is incorrect, those bugs might allow unauthorized access. If this is a concern, you can partition your data to put highly sensitive data into a table that requires a more restrictive user. For example, you might put credit card information into a table available only to a user with a need for that information: most requests will be done by a different user to different tables.

Scanners, and more specifically, the client-side iterators they create, are the resource hogs. Everything else is bookkeeping.

-Eric

On Tue, Dec 1, 2015 at 4:11 AM, mohit.kaushik <mohit.kaus...@orkash.com> wrote:

> Josh,
>
> If resources are a concern, would it be better to use the superuser
> approach: a single user having all authorizations assigned, and using the
> scanner to provide user authorizations? Does it decrease the security
> level? How do the custom authenticator and authorizers help in this case,
> and how can I implement them if needed?
>
> Thanks
> Mohit Kaushik
>
>
> On 11/30/2015 08:59 PM, Josh Elser wrote:
>
> Connector is tied to a specific user, so you're tied to a user for a given
> instance.
>
> I'm not aware of any testing in that direction (lots of active
> connectors). Connectors aren't particularly heavy; you could keep some
> cache of recently used instances and recreate them when they were evicted
> from the cache due to inactivity.
>
> The only fundamental limitation of concurrent Connector instances that I
> can think of is at the RPC level. Eventually, the RPCs that the Connector
> is making to Accumulo servers correlate to server-side resources, which
> are finite. If you have some reasonable hardware, I don't think this is a
> real concern.
>
> Would be curious to hear back how this works.
>
> mohit.kaushik wrote:
>
> I am creating a connector per user, as every user has a different set of
> authorizations. I want to know: is there any limit on creating Accumulo
> connectors? What is the maximum number of connectors that Accumulo can
> handle? For example, if my application will have 3M users, is it correct
> to create 3M connections for them, or is there any way to share
> connections for different users having different authorizations?
>
> Thanks
> Mohit Kaushik
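
For anyone reading this thread later, here is a rough sketch of the "queryuser" bookkeeping described in Eric's reply: look up the authorizations for the real-life person, then restrict them to what the shared query user actually holds before scanning. It is plain Java with no Accumulo dependency. The class QueryAuthResolver, its methods grant and authsFor, and the sample people and labels other than Dr. Smith and "eric.newton" are all hypothetical names made up for illustration. Intersecting with the query user's authorizations is a defensive choice assumed here, not something prescribed in the thread.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/**
 * Hypothetical helper, not part of the Accumulo API. It only models the
 * person-to-authorizations lookup that the query layer would perform.
 */
final class QueryAuthResolver {
    // Authorizations held by the shared query user, e.g. "doctor" (assumed values).
    private final Set<String> queryUserAuths;
    // Real-life person -> authorization labels the application assigns (assumed store).
    private final Map<String, Set<String>> personAuths = new HashMap<>();

    QueryAuthResolver(Set<String> queryUserAuths) {
        this.queryUserAuths = new TreeSet<>(queryUserAuths);
    }

    /** Record that a real-life person may use the given authorization labels. */
    void grant(String person, String... auths) {
        Set<String> labels = personAuths.computeIfAbsent(person, k -> new TreeSet<>());
        Collections.addAll(labels, auths);
    }

    /**
     * Labels to attach to a scan made on behalf of the person. Unknown people
     * get no labels, and labels the query user does not hold are dropped.
     */
    Set<String> authsFor(String person) {
        Set<String> granted = personAuths.get(person);
        if (granted == null) {
            return Collections.emptySet();
        }
        Set<String> effective = new TreeSet<>(granted);
        effective.retainAll(queryUserAuths);
        return effective;
    }

    public static void main(String[] args) {
        QueryAuthResolver resolver = new QueryAuthResolver(
                new TreeSet<>(Arrays.asList("eric.newton", "jane.doe")));
        resolver.grant("dr.smith", "eric.newton");
        resolver.grant("dr.jones", "jane.doe", "admin");

        System.out.println(resolver.authsFor("dr.smith")); // [eric.newton]
        System.out.println(resolver.authsFor("dr.jones")); // [jane.doe]
        System.out.println(resolver.authsFor("nobody"));   // []
    }
}
```

In a real client, the resulting set would presumably be wrapped in an Authorizations object and passed to Connector.createScanner(table, authorizations) on the one shared Connector, so the per-person cost stays in this lookup and in the scanners, not in extra Connector objects.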