A close method would definitely help. I think there's also a concern of
deadlock (I keep waffling on this without just writing some code to
(dis)prove it).
Consider the following (since we just had the Phoenix question hit the
list):
You have some IndexUpdatingIterator that writes out new records on MinC.
The data tablet and secondary index tablet end up residing on the same
tabletserver. When you start flushing the data tablet, you try to create
the secondary index records, but those writes will never finish until the
MinC completes: deadlock. To get around this would be a fairly big change in
how the tabletserver manages memory and writes -- I can't speak to whether
it's even feasible without reading more code.
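To make the worry concrete, here is a toy model of that scenario. It is a sketch under assumptions, not tabletserver code: `ToyTabletServer`, its fixed-size write buffer, and `minorCompact` are hypothetical names, and the real memory management is certainly more involved. A timeout stands in for "blocks forever" so the demo terminates.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Toy model only: these names are hypothetical and do not exist in Accumulo.
// Single-threaded demo, so the plain int counter needs no synchronization.
class ToyTabletServer {
    // Each permit is one free slot in the shared in-memory write buffer.
    private final Semaphore freeSlots;
    // Slots currently occupied; handed back only when a flush completes.
    private int held = 0;

    ToyTabletServer(int capacity) {
        freeSlots = new Semaphore(capacity);
    }

    // A write takes one buffer slot, waiting up to timeoutMs for space.
    boolean write(long timeoutMs) {
        try {
            boolean ok = freeSlots.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS);
            if (ok) {
                held++;
            }
            return ok;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Flush the buffer. With writesIndex set, the flush must first write an
    // index record into the same buffer, as an index-updating iterator would.
    boolean minorCompact(boolean writesIndex, long timeoutMs) {
        if (writesIndex && !write(timeoutMs)) {
            return false; // timed out; without the timeout this would hang
        }
        freeSlots.release(held); // flush done: buffer memory is free again
        held = 0;
        return true;
    }

    public static void main(String[] args) {
        ToyTabletServer ts = new ToyTabletServer(2);
        ts.write(10);
        ts.write(10); // buffer is now full
        System.out.println("MinC with index write finished: " + ts.minorCompact(true, 100));
        System.out.println("plain MinC finished: " + ts.minorCompact(false, 100));
    }
}
```

In this model the flush that writes index records can never finish once the buffer is full, while a plain flush completes immediately. Whether the real tabletserver behaves the same way is exactly the part that still needs code reading to confirm.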
Maybe it's enough to not hook into it at MinC scope? MajC would have a
bit of a delay involved for that index update. You would probably want
to write some local data to know when you updated the index too, so you
don't repeatedly update it.
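A rough illustration of the "remember what you already indexed" idea, in plain Java. Everything here is an assumption for the sake of the sketch: `IndexOnceSketch`, the in-memory marker set (standing in for local data persisted with the tablet), and the map standing in for the index table are all hypothetical, and the sketch ignores updated or deleted values.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: none of these names are Accumulo APIs.
class IndexOnceSketch {
    // Stand-in for marker data stored locally with the data tablet.
    private final Set<String> indexedRows = new HashSet<>();
    // Stand-in for the secondary index table: value -> rows holding it.
    private final Map<String, Set<String>> index = new TreeMap<>();
    // Counts how many index records were actually written.
    int indexWrites = 0;

    // Called once per compaction pass over the tablet's row -> value data.
    void compact(Map<String, String> rows) {
        for (Map.Entry<String, String> e : rows.entrySet()) {
            // add() returns true only the first time a row is seen,
            // so later compactions skip rows that are already indexed.
            if (indexedRows.add(e.getKey())) {
                index.computeIfAbsent(e.getValue(), k -> new TreeSet<>()).add(e.getKey());
                indexWrites++;
            }
        }
    }

    Set<String> lookup(String value) {
        return index.getOrDefault(value, Collections.emptySet());
    }

    public static void main(String[] args) {
        IndexOnceSketch sketch = new IndexOnceSketch();
        Map<String, String> rows = new TreeMap<>();
        rows.put("row1", "red");
        rows.put("row2", "red");
        sketch.compact(rows); // first MajC writes two index records
        sketch.compact(rows); // second MajC writes none
        System.out.println("index writes: " + sketch.indexWrites);
        System.out.println("rows for red: " + sketch.lookup("red"));
    }
}
```

The marker check is what keeps a second compaction from rewriting index entries; a real version would have to decide where that marker lives and how it survives tablet migration.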
My gut still tells me that trying to focus on percolator would be better
(given that the problem posed is typically analogous to what percolator
describes). Maybe we can encourage Keith to give a little overview as to
what the current state is, where he thinks it needs to go, and where in
the code patches would be good to hit :)
On 4/29/14, 10:27 AM, Donald Miner wrote:
Just to be clear, I'm not asking "why shouldn't I do this"... I'm asking
"what can be added feature-wise to Accumulo to support this?" ... because I
want to do it :)
So, I guess if there was a close method on an iterator that got called when
it was torn down... that would help?
On Tue, Apr 29, 2014 at 10:24 AM, <[email protected]> wrote:
One reason that I can think of is that there is not a close() method on
the iterator interface. If you had resources open, you wouldn't know when to
clean them up.
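For illustration, this is roughly what a teardown hook could look like. It is a sketch in plain Java, not Accumulo's iterator interface: `ClosableIterator` and `WritingIterator` are hypothetical names, and the `pending` list merely stands in for whatever resource (for example a writer with buffered mutations) the iterator might hold open.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical iterator that buffers side-effect writes and needs teardown.
class WritingIterator implements ClosableIterator<String> {
    private final Iterator<String> source;
    // Stand-in for an open resource holding buffered writes.
    private final List<String> pending = new ArrayList<>();
    // Records what was actually flushed, so the demo can inspect it.
    final List<String> flushed = new ArrayList<>();
    boolean closed = false;

    WritingIterator(Iterator<String> source) {
        this.source = source;
    }

    @Override
    public boolean hasNext() {
        return source.hasNext();
    }

    @Override
    public String next() {
        String value = source.next();
        pending.add("idx:" + value); // buffered, not yet written anywhere
        return value;
    }

    // The hook under discussion: flush and release when the iterator is torn down.
    @Override
    public void close() {
        flushed.addAll(pending);
        pending.clear();
        closed = true;
    }

    public static void main(String[] args) {
        WritingIterator it = new WritingIterator(Arrays.asList("a", "b").iterator());
        // try-with-resources plays the role of the framework calling close().
        try (WritingIterator iter = it) {
            while (iter.hasNext()) {
                iter.next();
            }
        }
        System.out.println("closed=" + it.closed + " flushed=" + it.flushed);
    }
}

// Hypothetical contract: an iterator with a teardown hook and no checked exception.
interface ClosableIterator<T> extends Iterator<T>, AutoCloseable {
    @Override
    void close();
}
```

Without a guaranteed call to something like `close()`, the buffered entries in this sketch would never be flushed, which is the cleanup problem described above.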
----- Original Message -----
From: "Donald Miner" <[email protected]>
To: [email protected]
Sent: Tuesday, April 29, 2014 10:20:40 AM
Subject: Writing data from iterators
Bit of a tangent... This came up earlier in the text indexing thread and
below, and is something I've seen come up a couple of other times.
What would it take to make it so something tabletserver-side could write to
Accumulo (in an "acceptable way")? Be it an iterator/constraint or
something new.
-d
On Tue, Apr 29, 2014 at 9:59 AM, Eric Newton <[email protected]>
wrote:
I have taken a quick look at phoenix. It's baked into HBase-specific
features pretty hard.
It uses coprocessors to do things like create index entries. This is a
common enough idiom in the HBase community, but not something we've
supported in Accumulo. In general, you do not want an Accumulo Iterator
or Constraint generating data for other tables.
However, a more sophisticated Percolator type implementation (
https://github.com/keith-turner/Accismus) could support index generation
and query transactions.
We could probably re-use a lot of it, but it's not going to be as simple
as changing the classes that talk to the database back-end.
-Eric
On Tue, Apr 29, 2014 at 9:21 AM, Kepner, Jeremy - 0553 - MITLL <
[email protected]> wrote:
Hi James,
Can you explain how the SQL layer to HBase works?
Regards. -Jeremy
On Apr 29, 2014, at 1:32 AM, James Taylor <[email protected]>
wrote:
Hello,
Would there be any interest in developing a SQL layer on top of Accumulo?
I'm part of the Apache Phoenix project and we've built a similar system on
top of HBase. I wanted to see if there'd be interest on your end in working
with us to generalize our client and provide a server that would do
Accumulo-specific push-down in support of a SQL layer. I suspect there's
enough similarity between HBase and Accumulo that this would be feasible.
Thanks,
James
--
Donald Miner
Chief Technology Officer
ClearEdge IT Solutions, LLC
Cell: 443 799 7807
www.clearedgeit.com