Perhaps, but the interesting thing is that I think the grouping functionality is actually very similar: it's just a static structure instead of a dynamic one. At least, that's the case if I understand the Solr feature correctly.
Maybe we should call it a DocumentCollection, since it's a collection of documents.

Aaron

On Tue, Oct 1, 2013 at 10:04 PM, Otis Gospodnetic <[email protected]> wrote:

> Hi,
>
> Note that Solr and Lucene both have grouping functionality, which some
> people may confuse with the DocGroups you are talking about here.
>
> Otis
> --
> Solr & ElasticSearch Support -- http://sematext.com/
> Performance Monitoring -- http://sematext.com/spm
>
> On Mon, Sep 30, 2013 at 1:09 PM, Aaron McCurry <[email protected]> wrote:
>
> > While I don't really like the idea of changing all the code to rename Row
> > and Record, I think it is necessary to help people who are new to Blur
> > transition from Lucene (or any other document store, for that matter).
> >
> > I think that having Doc and DocGroup both be first-class objects is also
> > critical. I think that for most implementations DocGroup is overkill and
> > Document is the only thing needed. I have some ideas on how to make this
> > possible in the API.
> >
> > Here's an example of what we could do. This is raw Thrift, which can be
> > ugly, but with some helper/utility classes it can be made better:
> >
> > Doc doc = new Doc();
> > doc.setDocId(new Value(_Fields.LONG_VAL, 1234L));
> > doc.addToFields(new Field("int_fieldname", new Value(_Fields.INT_VAL, 1234)));
> > doc.addToFields(new Field("string_fieldname", new Value(_Fields.STRING_VAL, "value1")));
> > doc.addToFields(new Field("text_fieldname", new Value(_Fields.TEXT_VAL, "this is full text indexed.")));
> >
> > DocGroup docGroup = new DocGroup();
> > docGroup.setDocGroupId(new Value(_Fields.STRING_VAL, "groupid12345"));
> > docGroup.addToDocs(doc);
> >
> > At this point I think I would like to keep the docId and docGroupId. I
> > know that Lucene itself doesn't require them, but if we don't have them,
> > deletes/updates become a lot more expensive. They would have to broadcast
> > the delete to all the shards of a table, which would kill NRT updates.
> >
> > Thoughts?
> >
> > Aaron
> >
> > On Mon, Sep 30, 2013 at 12:49 PM, Garrett Barton <[email protected]> wrote:
> >
> >> +1 here.
> >>
> >> I also agree with Colton about making docgroup/row optional. I know that in
> >> the current design it's not easy, but I remember Aaron saying that in the
> >> branch it might be possible to specify any column as the id, which makes me
> >> think it might be possible to not have one at all.
> >>
> >> On Sep 30, 2013 10:41 AM, "Colton McInroy" <[email protected]> wrote:
> >>
> >> > I personally think that the Row/Record/Column model makes sense. If you
> >> > have some documentation on the site listing the Lucene equivalents
> >> > to Blur terms, it would probably avoid those types of questions in the
> >> > future. With that explanation in place, you could leave the model the
> >> > same and avoid having to make a bunch of changes and cause chaos.
> >> >
> >> > Glad the Family attribute is being dropped. I kinda came in at the end of
> >> > its lifespan, I guess, because it doesn't really make much sense to me.
> >> > How long till it's actually dropped from the code, though?
> >> >
> >> > One thing I would like to see is Row be an option. In my current
> >> > implementation of Lucene code I don't use them at all, because with what I
> >> > am working with it makes no sense to have rows, really. I also don't recall
> >> > DocGroups being required in Lucene, and I never worked with them, so that
> >> > kinda threw me off when I ran into it.
> >> >
> >> > Thanks,
> >> > Colton McInroy
> >> >
> >> > * Director of Security Engineering
> >> >
> >> > Phone (Toll Free)
> >> > _US_ (888)-818-1344 Press 2
> >> > _UK_ 0-800-635-0551 Press 2
> >> >
> >> > My Extension 101
> >> > 24/7 Support [email protected] <mailto:[email protected]>
> >> > Email [email protected] <mailto:[email protected]>
> >> > Website http://www.dosarrest.com
> >> >
> >> > On 9/30/2013 6:45 AM, Tim Williams wrote:
> >> >
> >> >> Hi Devs,
> >> >> I'm wondering if we should go ahead and endure the [painful] move to a
> >> >> more intuitive data model in Blur? Here are some observations:
> >> >>
> >> >> 1) New folks coming to Blur have a background in Lucene - not
> >> >> necessarily a NoSQL data store - and want to know where their
> >> >> "Documents" are.
> >> >>
> >> >> 2) For folks aware of NoSQL stores, the Row/Record model can be
> >> >> misleading in terms of design tradeoffs.
> >> >>
> >> >> 3) The Row/Record model seems to bring a significant explanation burden.
> >> >>
> >> >> In the past we've talked about a model that's more aligned with
> >> >> Lucene's Documents. Aaron did some API work on a branch a while back
> >> >> and it's come up in an issue again recently.
> >> >>
> >> >> So, I'm wondering if now is the time to just endure some shortish
> >> >> period of pain and change everything over? The idea being something
> >> >> like:
> >> >>
> >> >> Row -> DocGroup
> >> >> Record -> Document
> >> >> Column -> Field
> >> >> Family -> (dropped)
> >> >>
> >> >> I think this will alleviate some confusion and provide a solid
> >> >> foundation for the long term, enabling a shorter learning curve.
> >> >>
> >> >> Such a big change would be good to get done while we're still a
> >> >> small-ish community, but I think it's important that everyone is on
> >> >> board - as it will no doubt create lots of short-term chaos and
> >> >> confusion...
> >> >>
> >> >> Thoughts?
> >> >>
> >> >> Thanks,
> >> >> --tim
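
For readers following the docId/docGroupId point in the thread above, here is a minimal sketch of why an id keeps deletes cheap. It is not Blur's or Lucene's API: the classes `Doc` and `DocGroup`, the methods `shardFor`, `shardsForDeleteById`, and `shardsForDeleteWithoutId`, and the hash-modulo routing rule are all assumptions made for illustration. The idea is simply that a known id can be mapped to a single shard, while a delete without an id has to be sent to every shard of the table.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative sketch only; none of these names are Blur or Lucene APIs. */
public class DocGroupRoutingSketch {

    /** Hypothetical stand-in for the proposed Doc: an id plus name/value fields. */
    static final class Doc {
        final String docId;
        final List<String[]> fields = new ArrayList<>();

        Doc(String docId) { this.docId = docId; }

        Doc addField(String name, String value) {
            fields.add(new String[] {name, value});
            return this;
        }
    }

    /** Hypothetical stand-in for the proposed DocGroup: an id plus its docs. */
    static final class DocGroup {
        final String docGroupId;
        final List<Doc> docs = new ArrayList<>();

        DocGroup(String docGroupId) { this.docGroupId = docGroupId; }

        DocGroup addDoc(Doc doc) {
            docs.add(doc);
            return this;
        }
    }

    /** Assumed routing rule: hash of the id modulo the shard count. */
    public static int shardFor(String id, int numShards) {
        return Math.floorMod(id.hashCode(), numShards);
    }

    /** With an id, a delete only needs to reach the one shard that owns it. */
    public static List<Integer> shardsForDeleteById(String id, int numShards) {
        return Collections.singletonList(shardFor(id, numShards));
    }

    /** Without an id, the delete has to be broadcast to every shard. */
    public static List<Integer> shardsForDeleteWithoutId(int numShards) {
        List<Integer> all = new ArrayList<>();
        for (int shard = 0; shard < numShards; shard++) {
            all.add(shard);
        }
        return all;
    }

    public static void main(String[] args) {
        int numShards = 8; // arbitrary shard count for the example
        DocGroup group = new DocGroup("groupid12345")
            .addDoc(new Doc("1234").addField("string_fieldname", "value1"));

        System.out.println("delete by id touches "
            + shardsForDeleteById(group.docGroupId, numShards).size() + " shard(s)");
        System.out.println("delete without id touches "
            + shardsForDeleteWithoutId(numShards).size() + " shard(s)");
    }
}
```

The sketch routes on the group id so that every Doc in a DocGroup would land on the same shard; whether Blur routes this way is not stated in the thread and is assumed here.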
