Re: [External] Re: Accumulo and Arrow

2021-02-19 Thread Emilio Lahr-Vivaz
--- *From:* Emilio Lahr-Vivaz *Sent:* Wednesday, February 17, 2021 4:09 PM *To:* user@accumulo.apache.org *Subject:* Re: [External] Re: Accumulo and Arrow I believe that was a theoretical - I don't think there has been any actual integration at this point. But I'd be happy to be proven wrong :) Th

Re: [External] Re: Accumulo and Arrow

2021-02-17 Thread Emilio Lahr-Vivaz
ing-apache-arrow-a-fast-interoperable-in-memory-columnar-data-structure-standard/>. *From:* Emilio Lahr-Vivaz *Sent:* Wednesday, February 17, 2021 11:04 AM *To:* user@accumulo.apache.org *Subject:* [External] Re: Ac

Re: Accumulo and Arrow

2021-02-17 Thread Emilio Lahr-Vivaz
Hello, Do you have a link that describes the integration between HBase and Arrow? I didn't find anything except some theoretical discussions. My understanding is that Arrow is meant for in-memory representations, and there is no plan to, e.g., replace HFiles or RFiles with Arrow files in

Re: Noob questions

2020-04-14 Thread Emilio Lahr-Vivaz
You should be able to use a conditional writer to support 'put if absent': https://accumulo.apache.org/docs/2.x/getting-started/clients#conditionalwriter Generally you would not want to repeatedly write the same key/value, as you will have to scan every single versioned entry when you want to

Re: Kerberos Ticket Renewal (when not updating Hadoop user)

2019-06-13 Thread Emilio Lahr-Vivaz
Hello, GeoMesa is a library (providing a GeoTools data store backed by Accumulo, among other things), so there isn't a single entry point. We could try to wrap every method call, but that would likely be complex (the GeoTools API has a lot of surface area). Would it make sense to create a

Re: Getting IOExceptions in internalRead!

2017-05-04 Thread Emilio Lahr-Vivaz
Hi Suresh, David, I think I speculated that it *might* be a bug in GeoMesa, but I haven't seen any evidence of that yet. David, do you still see those warnings after cleaning up your close methods? In normal operation, I don't see those, but possibly the access patterns you're using are

Re: Configuring batch writers

2016-07-15 Thread Emilio Lahr-Vivaz
Another thing to consider is how many tablet servers the mutations are being sent to - if they're all going to a single split, that's going to reduce your throughput a lot. On 07/15/2016 02:33 PM, dlmar...@comcast.net wrote: The batch writer has several knobs (latency time, memory buffer, etc)

Re: BatchScanner taking too much time to scan rows

2015-05-13 Thread Emilio Lahr-Vivaz
It sounds like each of your ranges is an ID, i.e. a single row. I've found that scanning lots of non-sequential single-row ranges is pretty slow in Accumulo. Your best approach is probably to create an index table on whatever you are originally trying to query (assuming those 1 ids came