Re: [jira] [Closed] (BLUR-245) There is a deadlock condition that can occur during mutate batch calls.

Colton McInroy Sun, 29 Sep 2013 08:35:20 -0700

What about the family attribute?

So, does that mean rowid and recordid have to be manually generated?

Ideally, I would just like to insert records into a table... I was
thinking that I would create a table for each program that's getting
its logs indexed. I just had a thought about this though. Perhaps I
could create a table for a time period, like for a month, then use the
program name as the rowid. That still leaves me with a recordid, which I
would prefer to have generated automatically, and I am not sure if it is. If
it isn't uniquely generated, do you suggest I use something like
UUID.randomUUID()?


Thanks,
Colton McInroy

 * Director of Security Engineering

        
Phone
(Toll Free)     
_US_    (888)-818-1344 Press 2
_UK_    0-800-635-0551 Press 2

My Extension    101
24/7 Support    [email protected] <mailto:[email protected]>
Email   [email protected] <mailto:[email protected]>
Website         http://www.dosarrest.com

On 9/29/2013 7:29 AM, Aaron McCurry wrote:

On Sun, Sep 29, 2013 at 9:47 AM, Colton McInroy <[email protected]> wrote:

Glad to see you resolved this Aaron.

I am just in the process of building my parsing engine right now, so I
will make sure I update my build before I start doing the mutate calls.

I have been reading the usage examples on mutate calls. I find it somewhat
odd there is only mutate and no insert as well. I guess they are probably
both treated the same. I am getting close to building the add record
component to my parsing engine, but reading the code has left me somewhat
puzzled. With lucene I treated each "Document" with various "Field" types,
with Fields also being referenced as "Categories" for the facet indexing.
Now with Blur it is much different. This mutate call seems to require three
components which I am unsure of...
The rowid is different from a recordid how?... and can I insert just rows
with automatically generated ids? The data coming in won't have any unique
ids associated with it, and with lucene in my previous experience you
never needed to specify a recordid or rowid, it would automatically create
a document id upon adding a new "Document" to the index.
I am totally clueless as to what the family attribute is for.
I notice there are no column types. In my experience with Lucene you had
to specify the "Field" types to integer, string, etc but I see no ability
to do that in Blur. Is that handled automatically or something?

Ok, well you bring up some good points.  We have had some discussions about
renaming the objects in Blur to be closer to Lucene.

Records == Documents
Rows == Document Group
Column == Field

The rowid is present for 2 purposes.
   1. The rowid uniquely identifies the group of records
   2. The rowid is used to distribute the rows evenly across all the shards
within the table.  It hashes the rowid, using the BlurPartitioner, to
choose the shard where the row is stored and indexed.
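
To illustrate the second point, here is a minimal sketch of hash
partitioning by rowid. It is not Blur's BlurPartitioner; the class name
ShardSketch, the method shardFor, and the choice of hash function are all
assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ShardSketch {
    // Hypothetical helper: maps a rowid to a shard index in [0, numShards).
    // This is NOT Blur's BlurPartitioner, only an illustration of the idea.
    public static int shardFor(String rowId, int numShards) {
        int hash = Arrays.hashCode(rowId.getBytes(StandardCharsets.UTF_8));
        // Clear the sign bit so the modulo result is never negative.
        return (hash & Integer.MAX_VALUE) % numShards;
    }

    public static void main(String[] args) {
        int numShards = 8; // assumed shard count for the example
        for (String rowId : new String[] {"apache-access", "sshd", "postfix"}) {
            System.out.println(rowId + " -> shard " + shardFor(rowId, numShards));
        }
    }
}
```

With a scheme like this, every record that shares a rowid lands on the same
shard, so a table with only a handful of distinct rowids (for example, one
per program name) would probably not spread evenly.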

The recordid is used to locate the record within the row so that single
records can be fetched without the entire row.

If we go forward with the rename in 0.3.0 it will likely be something like:

Column => Field
Record => Document
Row => DocumentGroup

RecordId => DocId
RowId => DocGroupId

Another change will be that Documents and DocumentGroups will be allowed as
indexable units (instead of just Rows now).   However the DocId and
DocGroupId will likely still be required.  You could make them UUIDs or
something like that.
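
A minimal sketch of generating such ids with the JDK's UUID class follows.
The class name IdSketch, the helper newId, and the use of a program name
as the rowid are assumptions for illustration; nothing here calls Blur.

```java
import java.util.UUID;

public class IdSketch {
    // Hypothetical helper: returns a random (version 4) UUID as a string id.
    public static String newId() {
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        String rowId = "sshd";     // assumption: program name used as the rowid
        String recordId = newId(); // one fresh id per incoming log line
        System.out.println(rowId + " / " + recordId);
    }
}
```

The generated strings would then be passed wherever the mutate call expects
a recordid (or a rowid, if rows should not be grouped by anything
meaningful).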

As far as the types, you will need to use the addColumnDefinition call:

http://incubator.apache.org/blur/docs/0.2.0/Blur.html#Fn_Blur_addColumnDefinition

And you can reference the types:

http://incubator.apache.org/blur/docs/0.2.0/data-model.html#types

Hope this helps, I know it's a bit clumsy but we have plans to improve.

Thanks,
Aaron

Thanks,
Colton McInroy



On 9/29/2013 6:20 AM, Aaron McCurry (JIRA) wrote:

       [ https://issues.apache.org/jira/browse/BLUR-245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron McCurry closed BLUR-245.
------------------------------

      Resolution: Fixed

https://git-wip-us.apache.org/repos/asf?p=incubator-blur.git;a=commit;h=6b000703457e64d5c9334426ed012c027a359eb3

https://git-wip-us.apache.org/repos/asf?p=incubator-blur.git;a=commit;h=ffc817c4401ce53b6ba1b0fed700260d34c8acac

  There is a deadlock condition that can occur during mutate batch calls.

-----------------------------------------------------------------------

                  Key: BLUR-245
                  URL: https://issues.apache.org/jira/browse/BLUR-245
              Project: Apache Blur
           Issue Type: Bug
           Components: Blur
     Affects Versions: 0.3.0, 0.2.1
             Reporter: Aaron McCurry
             Priority: Blocker
              Fix For: 0.3.0, 0.2.1


Basically there is a thread pool that the mutates use for performing the
mutate.  However, the batch mutate call in the index manager submits a job,
and that submitted job in turn creates more jobs (one for each shard).  This
can cause a deadlock condition in the thread pool, because the thread pool
is a fixed size.
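
The pattern described above can be reproduced in isolation. The sketch
below is not Blur's index manager code, and the separate-pool variant is
only one common way to avoid the problem, not necessarily what the fix
commits do. The class name NestedSubmitSketch and the method run are
hypothetical. A timeout stands in for the deadlock so that the example
terminates.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class NestedSubmitSketch {
    // The outer job submits an inner job to innerPool and waits for it.
    // Returns true if the inner job finished before the timeout expired.
    public static boolean run(ExecutorService outerPool, ExecutorService innerPool)
            throws Exception {
        Future<Boolean> outer = outerPool.submit(() -> {
            Future<String> inner = innerPool.submit(() -> "shard done");
            try {
                inner.get(500, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                // Without the timeout this wait would never end: the inner
                // job is queued behind the thread that is waiting for it.
                inner.cancel(true);
                return false;
            }
        });
        return outer.get();
    }

    public static void main(String[] args) throws Exception {
        // A pool of size 1 makes the starvation easy to trigger.
        ExecutorService shared = Executors.newFixedThreadPool(1);
        ExecutorService separate = Executors.newFixedThreadPool(1);
        try {
            System.out.println("same pool completes: " + run(shared, shared));
            System.out.println("separate pools complete: " + run(shared, separate));
        } finally {
            shared.shutdownNow();
            separate.shutdownNow();
        }
    }
}
```

When the outer and inner jobs share the one-thread pool, the inner job has
no free thread to run on and the wait times out; with a second pool for the
inner jobs it completes. A larger fixed pool only makes the problem rarer,
since enough concurrent outer jobs can still occupy every thread.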


--
This message was sent by Atlassian JIRA
(v6.1#6144)
