+1. It's a much cleaner approach. I struggled a lot to understand what
the family is for and how it should be translated to Lucene
documents while writing data to Lucene (Aaron knows!).

I think document collection is required. But in use cases where the
user just wants to use Blur as a scalable Lucene and store only
documents, document grouping is overhead. So, as suggested,
it should be optional.

Regards,
Gagan

On Sun, Oct 13, 2013 at 12:15 AM, Aaron McCurry <[email protected]> wrote:
> Perhaps, but the interesting thing is that I think that grouping
> functionality is actually very similar.  It's just a static structure
> instead of being dynamic.  At least if I understand the Solr feature
> correctly.
>
> Maybe we should call it a DocumentCollection.  Since it's a collection of
> documents.
>
> Aaron
>
>
> On Tue, Oct 1, 2013 at 10:04 PM, Otis Gospodnetic <[email protected]> wrote:
>
>> Hi,
>>
>> Note that Solr and Lucene both have grouping functionality, which some
>> people may confuse with DocGroups you are talking about here.
>>
>> Otis
>> --
>> Solr & ElasticSearch Support -- http://sematext.com/
>> Performance Monitoring -- http://sematext.com/spm
>>
>>
>>
>> On Mon, Sep 30, 2013 at 1:09 PM, Aaron McCurry <[email protected]> wrote:
>> > While I don't really like the idea of changing all the code to rename Row
>> > and Record, I think it is necessary to help people who are new to Blur
>> > transition from Lucene (or any other document store for that matter).
>> >
>> > I think that having Doc and DocGroup both be first class objects is also
>> > critical.  I think that for most implementations DocGroup is overkill and
>> > Document is the only thing needed.  I have some ideas on how to make this
>> > possible in the API.
>> >
>> > Here's an example of what we could do. This is raw Thrift, which can be
>> > ugly, but with some helper/utility classes it can be made better:
>> >
>> > Doc doc = new Doc();
>> > doc.setDocId(new Value(_Fields.LONG_VAL, 1234L));
>> > doc.addToFields(new Field("int_fieldname",
>> >     new Value(_Fields.INT_VAL, 1234)));
>> > doc.addToFields(new Field("string_fieldname",
>> >     new Value(_Fields.STRING_VAL, "value1")));
>> > doc.addToFields(new Field("text_fieldname",
>> >     new Value(_Fields.TEXT_VAL, "this is full text indexed.")));
>> >
>> >
>> > DocGroup docGroup = new DocGroup();
>> > docGroup.setDocGroupId(new Value(_Fields.STRING_VAL, "groupid12345"));
>> > docGroup.addToDocs(doc);
>> >
>> > At this point I think I would like to keep the docId and docGroupId.  I
>> > know that Lucene itself doesn't require them, but if we don't have them
>> > deletes/updates become a lot more expensive.  They would have to broadcast
>> > the delete to all the shards of a table, which would kill NRT updates.
>> >
>> > Thoughts?
>> >
>> > Aaron
>> >
>> >
>> >
>> > On Mon, Sep 30, 2013 at 12:49 PM, Garrett Barton
>> > <[email protected]> wrote:
>> >
>> >> +1 here.
>> >>
>> >> I also agree with Colton about making docgroup/row optional. I know in the
>> >> current design it's not easy, but I remember Aaron saying in the branch it
>> >> might be possible to specify any column as the id, making me think it might
>> >> be possible to not have one at all.
>> >> On Sep 30, 2013 10:41 AM, "Colton McInroy" <[email protected]> wrote:
>> >>
>> >> > I personally think that the Row/Record/Column model makes sense. If you
>> >> > have some documentation on the site saying here are the Lucene equivalents
>> >> > to Blur it would probably avoid having those types of questions in the
>> >> > future. If you have an explanation of this, you could leave the model the
>> >> > same to avoid having to make a bunch of changes and cause chaos.
>> >> >
>> >> > Glad the Family attribute is being dropped. I kinda came in at the end of
>> >> > its lifespan I guess, because it doesn't really make much sense to me. How
>> >> > long till it's actually dropped from the code though?
>> >> >
>> >> > One thing I would like to see is Row be an option. In my current
>> >> > implementation of Lucene code I don't use them at all, because what I am
>> >> > working with makes no sense to have rows really. I also don't recall
>> >> > DocGroups being required in Lucene, and I never worked with them, so that
>> >> > kinda threw me off when I ran into it.
>> >> >
>> >> > Thanks,
>> >> > Colton McInroy
>> >> >
>> >> >  * Director of Security Engineering
>> >> >
>> >> >
>> >> > Phone
>> >> > (Toll Free)
>> >> > _US_    (888)-818-1344 Press 2
>> >> > _UK_    0-800-635-0551 Press 2
>> >> >
>> >> > My Extension    101
>> >> > 24/7 Support    [email protected] <mailto:[email protected]>
>> >> > Email   [email protected] <mailto:[email protected]>
>> >> > Website         http://www.dosarrest.com
>> >> >
>> >> > On 9/30/2013 6:45 AM, Tim Williams wrote:
>> >> >
>> >> >> Hi Devs,
>> >> >> I'm wondering if we should go ahead and endure the [painful] move to a
>> >> >> more intuitive data model in Blur?  Here are some observations:
>> >> >>
>> >> >> 1) New folks coming to Blur have a background in Lucene - not
>> >> >> necessarily a NoSQL data store - and want to know where their
>> >> >> "Documents" are.
>> >> >>
>> >> >> 2) For folks aware of NoSQL stores, the Row/Record model can be
>> >> >> misleading in terms of design tradeoffs.
>> >> >>
>> >> >> 3) The Row/Record model seems to bring a significant explanation burden.
>> >> >>
>> >> >> In the past we've talked about a model that's more aligned with
>> >> >> Lucene's Documents.  Aaron did some API work on a branch a while back
>> >> >> and it's come up in an issue again recently.
>> >> >>
>> >> >> So, I'm wondering if now is the time to just endure some shortish
>> >> >> period of pain changing everything over now?  The idea being something
>> >> >> like:
>> >> >>
>> >> >> Row -> DocGroup
>> >> >> Record -> Document
>> >> >> Column -> Field
>> >> >> Family -> (dropped)
>> >> >>
>> >> >> I think this will alleviate some confusion and provide a solid
>> >> >> foundation for the long term; enabling a shorter learning curve and
>> >> >> less confusion.
>> >> >>
>> >> >> Such a big change would be good to get done while we're still a
>> >> >> small-ish community but I think it's important that everyone is on
>> >> >> board - as it will no doubt create lots of short term chaos and
>> >> >> confusion...
>> >> >>
>> >> >> Thoughts?
>> >> >>
>> >> >> Thanks,
>> >> >> --tim
>> >> >>
>> >> >
>> >> >
>> >>
>>

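For anyone following the docId/docGroupId point quoted above, here is a rough
sketch of why an id makes deletes cheaper. This is not Blur's code: the class
name, the method names, and the hash-modulo routing rule are all assumptions
made up for illustration. The idea is only that an id gives the cluster
something to route on, while a delete without an id has to go to every shard.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names come from Blur.
class ShardRoutingSketch {

    // Assumed routing rule: hash the group id onto one of numShards shards.
    static int shardFor(String docGroupId, int numShards) {
        return Math.floorMod(docGroupId.hashCode(), numShards);
    }

    // With an id, the delete is sent to exactly one shard.
    static List<Integer> shardsForDeleteById(String docGroupId, int numShards) {
        List<Integer> targets = new ArrayList<>();
        targets.add(shardFor(docGroupId, numShards));
        return targets;
    }

    // Without an id there is nothing to route on, so every shard is contacted.
    static List<Integer> shardsForDeleteWithoutId(int numShards) {
        List<Integer> targets = new ArrayList<>();
        for (int shard = 0; shard < numShards; shard++) {
            targets.add(shard);
        }
        return targets;
    }

    public static void main(String[] args) {
        int numShards = 8; // arbitrary shard count for the example
        System.out.println("delete by id contacts "
                + shardsForDeleteById("groupid12345", numShards).size() + " shard(s)");
        System.out.println("delete without id contacts "
                + shardsForDeleteWithoutId(numShards).size() + " shard(s)");
    }
}
```

The same reasoning would apply to updates, since an update has to find the
existing document before replacing it.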