Arshak,

Yes and no. Accumulo Combiners help a bit here.

For servicing inserts and deletes (treating an update as the combination of the two), both models work, although a serialized list is a little trickier to manage (as most optimizations end up being).

You will most likely want a Combiner set on your inverted index to aggregate multiple inserts together into a single Key-Value. This happens naturally for you at scan time (by virtue of the combiner) and then gets persisted to disk in merged form during a major compaction. The same logic can be applied to deletions. Keeping a sorted list of IDs in your serialized structure makes this algorithm pretty easy. One caveat to note is that Accumulo won't always compact *every* file in a tablet, so deletions may need to be persisted in that serialized structure to ensure that the deletion sticks (we can go more into that later, as I assume that isn't clear).
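
To make this concrete, here is a minimal sketch of such a Combiner in Java. The value encoding (a sorted, comma-delimited list of document IDs) and the class name are assumptions for illustration; a real implementation would likely use a compact serialized structure such as protocol buffers:

import java.util.Iterator;
import java.util.TreeSet;

import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.iterators.Combiner;

public class DocIdListCombiner extends Combiner {
  @Override
  public Value reduce(Key key, Iterator<Value> iter) {
    // TreeSet keeps the merged document IDs sorted and de-duplicated,
    // which keeps the serialized list easy to merge again later.
    TreeSet<String> docIds = new TreeSet<>();
    while (iter.hasNext()) {
      for (String id : new String(iter.next().get()).split(",")) {
        if (!id.isEmpty()) {
          docIds.add(id);
        }
      }
    }
    return new Value(String.join(",", docIds).getBytes());
  }
}

Attached at the scan, minc, and majc scopes, the merge then happens everywhere Accumulo reads or rewrites the data.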

Speaking loosely for D4M, as I haven't seen how its code uses Accumulo, both approaches should ensure referential integrity, and as such they should both be capable of servicing the same use cases. While keeping a serialized list is a bit more work in your code, you should see performance gains from this approach.

On 12/29/2013 5:45 PM, Arshak Navruzyan wrote:
Josh, I am still a little stuck on how this would work in a
transactional app (i.e., a mixed workload of reads and writes).

I definitely see the power of using a serialized structure in order to
minimize the number of records but what will happen when rows get
deleted out of the main table (or mutated)?   In the bloated model I
could see some referential integrity code zapping the index entries as
well.  In the serialized structure design it seems pretty complex to go
and update every array that referenced that row.

Is it fair to say that the D4M approach is a little better suited for
transactional apps and the wikisearch approach is better for
read-optimized index apps?


On Sun, Dec 29, 2013 at 12:27 PM, Josh Elser <[email protected]> wrote:

    Some context here in regards to the wikisearch:

    The point of the protocol buffers here (or any serialized structure
    in the Value) is to reduce the ingest pressure and increase query
    performance on the inverted index (or transpose table, if I follow
    the d4m phrasing).

    This works well because most languages (especially English) follow
    a Zipfian distribution: a few terms appear very frequently, while
    most occur very infrequently. For common terms, we don't want to
    bloat our index, nor spend time creating those index records (e.g.
    "the"). For uncommon terms, we still want direct access to these
    infrequent words (e.g. "supercalifragilisticexpialidocious").
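
    (A minimal sketch of that decision at ingest time, mirroring the
    wikisearch idea rather than its actual code; the threshold, table
    layout, and comma-delimited encoding are assumptions:)

    import java.util.List;

    import org.apache.accumulo.core.data.Mutation;
    import org.apache.accumulo.core.data.Value;

    public class TermIndexer {
      static final int MAX_DIRECT_DOCS = 500; // illustrative cut-off

      static Mutation indexTerm(String term, String shardId,
          List<String> docIds) {
        Mutation m = new Mutation(term);
        if (docIds.size() <= MAX_DIRECT_DOCS) {
          // Uncommon term: keep direct pointers to the documents.
          m.put("index", shardId,
              new Value(String.join(",", docIds).getBytes()));
        } else {
          // Common term: only record that it occurs in this shard.
          m.put("shard", shardId, new Value(new byte[0]));
        }
        return m;
      }
    }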

    The ingest effect is also rather interesting with Accumulo, as
    you're not just writing more data, but typically writing data to
    most (if not all) tservers. Even the tokenization of a single
    document is likely to create inserts to a majority of the tablets
    in your inverted index. When dealing with high ingest rates (live
    *or* bulk -- you still have to send data to these servers),
    minimizing the number of records becomes important, as record
    creation may be a bottleneck in your pipeline.

    The query implications are pretty straightforward: common terms
    neither bloat the index nor slow down uncommon-term lookups, and
    uncommon-term lookups remain specific to documents rather than to
    a range (shard) of documents.
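
    (For illustration, looking up a single term against such an index
    with a BatchScanner; the table name and encoding are assumptions,
    not the wikisearch code:)

    import java.util.Collections;
    import java.util.Map.Entry;

    import org.apache.accumulo.core.client.BatchScanner;
    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.data.Key;
    import org.apache.accumulo.core.data.Range;
    import org.apache.accumulo.core.data.Value;
    import org.apache.accumulo.core.security.Authorizations;

    public class TermLookup {
      static void lookup(Connector conn, String term) throws Exception {
        BatchScanner bs =
            conn.createBatchScanner("index", Authorizations.EMPTY, 4);
        try {
          bs.setRanges(Collections.singleton(Range.exact(term)));
          for (Entry<Key,Value> e : bs) {
            // For an uncommon term, the Value holds the document IDs.
            System.out.println(e.getKey().getColumnQualifier()
                + " -> " + e.getValue());
          }
        } finally {
          bs.close();
        }
      }
    }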


    On 12/29/2013 11:57 AM, Arshak Navruzyan wrote:

        Sorry I mixed things up.  It was in the wikisearch example:

        http://accumulo.apache.org/example/wikisearch.html

        "If the cardinality is small enough, it will track the set of
        documents
        by term directly."


        On Sun, Dec 29, 2013 at 8:42 AM, Kepner, Jeremy - 0553 - MITLL
        <[email protected]> wrote:

             Hi Arshak,
                See interspersed below.
             Regards.  -Jeremy

             On Dec 29, 2013, at 11:34 AM, Arshak Navruzyan
             <[email protected]> wrote:

                 Jeremy,

                 Thanks for the detailed explanation.  Just a couple of
            final
                 questions:

                 1.  What's your advice on the transpose table: repeat
                 the indexed term (one per matching row id), or try to
                 store all matching row ids from tedge in a single row
                 in tedgetranspose (using protobuf, for example)?
                 What's the performance implication of each approach?
                 In the paper you mentioned that if it's a few values
                 they should just be stored together.  Was there a
                 cut-off point in your testing?


             Can you clarify?  I am not sure what you're asking.


                 2.  You mentioned that the degrees should be
                 calculated beforehand for high ingest rates.  Doesn't
                 this change Accumulo from being a true database to
                 being more of an index?  If changes to the data cause
                 the degree table to get out of sync, it sounds like
                 changes have to be applied elsewhere first and
                 Accumulo has to be reloaded periodically.  Or perhaps
                 letting the degree table get out of sync is ok since
                 it's just an assist...


             My point was a very narrow comment on optimization in
             very high performance situations.  I probably shouldn't
             have mentioned it.  If you ever have performance issues
             with your degree tables, that would be the time to
             discuss it.  You may never encounter this issue.

                 Thanks,

                 Arshak


                 On Sat, Dec 28, 2013 at 10:36 AM, Kepner, Jeremy -
                 0553 - MITLL <[email protected]> wrote:

                     Hi Arshak,
                       Here is how you might do it.  We implement
                      everything with batch writers and batch scanners.
                      Note: if you are doing high ingest rates, the
                      degree table can be tricky and usually requires
                      pre-summing prior to ingestion to reduce the
                      pressure on the accumulator inside of Accumulo.
                      Feel free to ask further questions, as I would
                      imagine there are details that still won't be
                      clear -- in particular, why we do it this way.

                     Regards.  -Jeremy

                     Original data:

                      Machine,Pool,Load,ReadingTimestamp
                     neptune,west,5,1388191975000
                     neptune,west,9,1388191975010
                     pluto,east,13,1388191975090


                     Tedge table:
                     rowKey,columnQualifier,value

                      0005791918831-neptune,Machine|neptune,1
                      0005791918831-neptune,Pool|west,1
                      0005791918831-neptune,Load|5,1
                      0005791918831-neptune,ReadingTimestamp|1388191975000,1
                      0105791918831-neptune,Machine|neptune,1
                      0105791918831-neptune,Pool|west,1
                      0105791918831-neptune,Load|9,1
                      0105791918831-neptune,ReadingTimestamp|1388191975010,1
                      0905791918831-pluto,Machine|pluto,1
                      0905791918831-pluto,Pool|east,1
                      0905791918831-pluto,Load|13,1
                      0905791918831-pluto,ReadingTimestamp|1388191975090,1
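
                      (The row keys above are the reading timestamp
                      with its digits reversed, followed by the machine
                      name; flipping the digits spreads sequential
                      readings across tablets instead of hot-spotting
                      one.  A minimal sketch of that key construction
                      in Java -- class and method names illustrative:)

                      public class TedgeKeys {
                        static String rowKey(long ts, String machine) {
                          // Reverse the timestamp digits, then append
                          // the machine name, e.g.
                          // rowKey(1388191975000L, "neptune")
                          //   -> "0005791918831-neptune"
                          String flipped =
                              new StringBuilder(Long.toString(ts))
                                  .reverse().toString();
                          return flipped + "-" + machine;
                        }
                      }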


                     TedgeTranspose table:
                     rowKey,columnQualifier,value

                      Machine|neptune,0005791918831-neptune,1
                      Pool|west,0005791918831-neptune,1
                      Load|5,0005791918831-neptune,1
                      ReadingTimestamp|1388191975000,0005791918831-neptune,1
                      Machine|neptune,0105791918831-neptune,1
                      Pool|west,0105791918831-neptune,1
                      Load|9,0105791918831-neptune,1
                      ReadingTimestamp|1388191975010,0105791918831-neptune,1
                      Machine|pluto,0905791918831-pluto,1
                      Pool|east,0905791918831-pluto,1
                      Load|13,0905791918831-pluto,1
                      ReadingTimestamp|1388191975090,0905791918831-pluto,1


                     TedgeDegree table:
                     rowKey,columnQualifier,value

                     Machine|neptune,Degree,2
                     Pool|west,Degree,2
                     Load|5,Degree,1
                      ReadingTimestamp|1388191975000,Degree,1
                      Load|9,Degree,1
                      ReadingTimestamp|1388191975010,Degree,1
                      Machine|pluto,Degree,1
                      Pool|east,Degree,1
                      Load|13,Degree,1
                      ReadingTimestamp|1388191975090,Degree,1
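
                      (The degree values above are sums of the 1s
                      written per term.  A minimal sketch of how that
                      accumulation might be set up: Accumulo's
                      SummingCombiner attached to TedgeDegree, plus a
                      client-side pre-sum for high ingest rates.
                      Beyond the table name, the details are
                      assumptions for illustration:)

                      import java.util.EnumSet;
                      import java.util.HashMap;
                      import java.util.Map;

                      import org.apache.accumulo.core.client.Connector;
                      import org.apache.accumulo.core.client.IteratorSetting;
                      import org.apache.accumulo.core.iterators.IteratorUtil.IteratorScope;
                      import org.apache.accumulo.core.iterators.LongCombiner;
                      import org.apache.accumulo.core.iterators.user.SummingCombiner;

                      public class DegreeTableSetup {
                        // Attach a SummingCombiner so repeated (key, 1)
                        // entries merge into a single count.
                        static void attach(Connector conn) throws Exception {
                          IteratorSetting is = new IteratorSetting(
                              10, "degSum", SummingCombiner.class);
                          SummingCombiner.setCombineAllColumns(is, true);
                          SummingCombiner.setEncodingType(
                              is, LongCombiner.Type.STRING);
                          conn.tableOperations().attachIterator(
                              "TedgeDegree", is,
                              EnumSet.allOf(IteratorScope.class));
                        }

                        // Pre-sum client-side so a batch sends one
                        // mutation per term, not one per occurrence.
                        static Map<String,Long> preSum(Iterable<String> terms) {
                          Map<String,Long> counts = new HashMap<>();
                          for (String t : terms) {
                            counts.merge(t, 1L, Long::sum);
                          }
                          return counts;
                        }
                      }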


                     TedgeText table:
                     rowKey,columnQualifier,value

                      0005791918831-neptune,Text,<... raw text of original log ...>
                      0105791918831-neptune,Text,<... raw text of original log ...>
                      0905791918831-pluto,Text,<... raw text of original log ...>
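
                      (A minimal sketch of the batch-writer ingest for
                      the Tedge and TedgeTranspose tables above; the
                      empty column family and the method shape are
                      assumptions for illustration:)

                      import org.apache.accumulo.core.client.BatchWriter;
                      import org.apache.accumulo.core.client.BatchWriterConfig;
                      import org.apache.accumulo.core.client.Connector;
                      import org.apache.accumulo.core.data.Mutation;
                      import org.apache.accumulo.core.data.Value;

                      public class TedgeIngest {
                        static final Value ONE = new Value("1".getBytes());

                        // Writes one reading to Tedge and its
                        // transpose; the degree and text tables are
                        // handled the same way.
                        static void ingest(Connector conn, String rowKey,
                            String[] colQuals) throws Exception {
                          BatchWriter edge = conn.createBatchWriter(
                              "Tedge", new BatchWriterConfig());
                          BatchWriter trans = conn.createBatchWriter(
                              "TedgeTranspose", new BatchWriterConfig());
                          try {
                            Mutation m = new Mutation(rowKey);
                            for (String cq : colQuals) { // e.g. "Machine|neptune"
                              m.put("", cq, ONE);
                              // Transpose: swap row key and qualifier.
                              Mutation t = new Mutation(cq);
                              t.put("", rowKey, ONE);
                              trans.addMutation(t);
                            }
                            edge.addMutation(m);
                          } finally {
                            edge.close();
                            trans.close();
                          }
                        }
                      }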

                      On Dec 27, 2013, at 8:01 PM, Arshak Navruzyan
                      <[email protected]> wrote:

                     > Jeremy,
                     >
                     > Wow, didn't expect to get help from the author :)
                     >
                     > How about something simple like this:
                     >
                      > Machine   Pool   Load   ReadingTimestamp
                      > neptune   west   5      1388191975000
                      > neptune   west   9      1388191975010
                      > pluto     east   13     1388191975090
                     >
                     > These are the areas I am unclear on:
                     >
                      > 1.  Should the transpose table be built as part
                      > of the ingest code or as an Accumulo combiner?
                      > 2.  What does the degree table do in this
                      > example?  The paper mentions it's useful for
                      > query optimization.  How?
                      > 3.  Does D4M accommodate "repurposing" the
                      > row_id as a partition key?  The wikisearch
                      > shows how the partition id is important for
                      > parallel scans of the index.  But since
                      > Accumulo is a row store, how can you do fast
                      > lookups by row if you've used the row_id as a
                      > partition key?
                     >
                     > Thank you,
                     >
                     > Arshak
                     >
                     >
                     >
                     >
                     >
                     >
                      > On Thu, Dec 26, 2013 at 5:31 PM, Jeremy Kepner
                      > <[email protected]> wrote:
                     > Hi Arshak,
                      >   Maybe you can send a few (~3) records of data
                      > that you are familiar with and we can walk you
                      > through how the D4M schema would be applied to
                      > those records.
                     >
                     > Regards.  -Jeremy
                     >
                      > On Thu, Dec 26, 2013 at 03:10:59PM -0500,
                      > Arshak Navruzyan wrote:
                      > >    Hello,
                      > >    I am trying to get my head around Accumulo
                      > >    schema designs.  I went through a lot of
                      > >    trouble to get the wikisearch example
                      > >    running, but since the data is in protobuf
                      > >    lists, it's not that illustrative (for a
                      > >    newbie).  Would love to find another
                      > >    example that is a little simpler to
                      > >    understand.  In particular I am interested
                      > >    in java/scala code that mimics the D4M
                      > >    schema design (not a Matlab guy).
                      > >    Thanks,
                      > >    Arshak
                     >




