Hi,

I missed emails/issues where this functionality is described, so I'm commenting only on naming, trying to point out possible confusion with other search projects. "Collection" in Solr has a specific meaning - it's a logical index in a Solr(Cloud) cluster. So maybe that term could be avoided here, too.

Otis
--
Solr & ElasticSearch Support -- http://sematext.com/
Performance Monitoring -- http://sematext.com/spm


On Sat, Oct 12, 2013 at 2:45 PM, Aaron McCurry <[email protected]> wrote:

> Perhaps, but the interesting thing is that I think that grouping
> functionality is actually very similar. It's just a static structure
> instead of being dynamic. At least if I understand the Solr feature
> correctly.
>
> Maybe we should call it a DocumentCollection. Since it's a collection of
> documents.
>
> Aaron
>
>
> On Tue, Oct 1, 2013 at 10:04 PM, Otis Gospodnetic <
> [email protected]> wrote:
>
>> Hi,
>>
>> Note that Solr and Lucene both have grouping functionality, which some
>> people may confuse with the DocGroups you are talking about here.
>>
>> Otis
>> --
>> Solr & ElasticSearch Support -- http://sematext.com/
>> Performance Monitoring -- http://sematext.com/spm
>>
>>
>>
>> On Mon, Sep 30, 2013 at 1:09 PM, Aaron McCurry <[email protected]> wrote:
>> > While I don't really like the idea of changing all the code to rename Row
>> > and Record, I think it is necessary to help people who are new to Blur
>> > transition from Lucene (or any other document store for that matter).
>> >
>> > I think that having Doc and DocGroup both be first class objects is also
>> > critical. I think that for most implementations DocGroup is overkill and
>> > Document is the only thing needed. I have some ideas on how to make this
>> > possible in the API.
>> >
>> > Here's an example of what we could do. This is raw Thrift, which can be
>> > ugly, but with some helper/utility classes it can be made better:
>> >
>> > Doc doc = new Doc();
>> > doc.setDocId(new Value(_Fields.LONG_VAL, 1234L));
>> > doc.addToFields(new Field("int_fieldname", new Value(_Fields.INT_VAL, 1234)));
>> > doc.addToFields(new Field("string_fieldname", new Value(_Fields.STRING_VAL, "value1")));
>> > doc.addToFields(new Field("text_fieldname", new Value(_Fields.TEXT_VAL, "this is full text indexed.")));
>> >
>> >
>> > DocGroup docGroup = new DocGroup();
>> > docGroup.setDocGroupId(new Value(_Fields.STRING_VAL, "groupid12345"));
>> > docGroup.addToDocs(doc);
>> >
>> > At this point I think I would like to keep the docId and docGroupId. I
>> > know that Lucene itself doesn't require them, but if we don't have them
>> > deletes/updates become a lot more expensive. They would have to broadcast
>> > the delete to all the shards of a table, which would kill NRT updates.
>> >
>> > Thoughts?
>> >
>> > Aaron
>> >
>> >
>> >
>> > On Mon, Sep 30, 2013 at 12:49 PM, Garrett Barton
>> > <[email protected]> wrote:
>> >
>> >> +1 here.
>> >>
>> >> I also agree with Colton about making docgroup/row optional. I know in the
>> >> current design it's not easy, but I remember Aaron saying in the branch it
>> >> might be possible to specify any column as the ID, making me think it might
>> >> be possible to not have one at all.
>> >> On Sep 30, 2013 10:41 AM, "Colton McInroy" <[email protected]> wrote:
>> >>
>> >> > I personally think that the Row/Record/Column model makes sense. If you
>> >> > have some documentation on the site saying here are the Lucene equivalents
>> >> > to Blur it would probably avoid having those types of questions in the
>> >> > future. If you have an explanation of this, you could leave the model the
>> >> > same to avoid having to make a bunch of changes and cause chaos.
>> >> >
>> >> > Glad the Family attribute is being dropped. I kinda came in at the end of
>> >> > its lifespan I guess, because it doesn't really make much sense to me. How
>> >> > long till it's actually dropped from the code though?
>> >> >
>> >> > One thing I would like to see is Row be an option. In my current
>> >> > implementation of Lucene code I don't use them at all, because what I am
>> >> > working with makes no sense to have rows really. I also don't recall
>> >> > DocGroups being required in Lucene, and I never worked with them, so that
>> >> > kinda threw me off when I ran into it.
>> >> >
>> >> > Thanks,
>> >> > Colton McInroy
>> >> >
>> >> > * Director of Security Engineering
>> >> >
>> >> >
>> >> > Phone
>> >> > (Toll Free)
>> >> > _US_ (888)-818-1344 Press 2
>> >> > _UK_ 0-800-635-0551 Press 2
>> >> >
>> >> > My Extension 101
>> >> > 24/7 Support [email protected] <mailto:[email protected]>
>> >> > Email [email protected] <mailto:[email protected]>
>> >> > Website http://www.dosarrest.com
>> >> >
>> >> > On 9/30/2013 6:45 AM, Tim Williams wrote:
>> >> >
>> >> >> Hi Devs,
>> >> >> I'm wondering if we should go ahead and endure the [painful] move to a
>> >> >> more intuitive data model in Blur? Here are some observations:
>> >> >>
>> >> >> 1) New folks coming to Blur have a background in Lucene - not
>> >> >> necessarily a NoSQL data store - and want to know where their
>> >> >> "Documents" are.
>> >> >>
>> >> >> 2) For folks aware of NoSQL stores, the Row/Record model can be
>> >> >> misleading in terms of design tradeoffs.
>> >> >>
>> >> >> 3) The Row/Record model seems to bring a significant explanation burden.
>> >> >>
>> >> >> In the past we've talked about a model that's more aligned with
>> >> >> Lucene's Documents. Aaron did some API work on a branch a while back
>> >> >> and it's come up in an issue again recently.
>> >> >>
>> >> >> So, I'm wondering if now is the time to just endure some shortish
>> >> >> period of pain changing everything over? The idea being something
>> >> >> like:
>> >> >>
>> >> >> Row -> DocGroup
>> >> >> Record -> Document
>> >> >> Column -> Field
>> >> >> Family -> (dropped)
>> >> >>
>> >> >> I think this will alleviate some confusion and provide a solid
>> >> >> foundation for the long term, enabling a shorter learning curve and
>> >> >> less confusion.
>> >> >>
>> >> >> Such a big change would be good to get done while we're still a
>> >> >> small-ish community, but I think it's important that everyone is on
>> >> >> board - as it will no doubt create lots of short term chaos and
>> >> >> confusion...
>> >> >>
>> >> >> Thoughts?
>> >> >>
>> >> >> Thanks,
>> >> >> --tim
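
For reference, the point raised in the thread about keeping docId/docGroupId (without them, a delete has to be broadcast to every shard of a table) can be illustrated with a rough sketch. This is not Blur's code: ShardRouter, shardFor and shardsToContact are hypothetical names, and the hash-modulo placement is only an assumption about how an ID might map to a shard.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Blur's API.
final class ShardRouter {
    private final int shardCount;

    ShardRouter(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    // Assumed placement rule: hash of the group id modulo the shard count.
    // floorMod keeps the result non-negative for negative hash codes.
    int shardFor(String docGroupId) {
        return Math.floorMod(docGroupId.hashCode(), shardCount);
    }

    // Shards a delete has to reach: exactly one when the group id is known,
    // every shard of the table when it is not (null stands for "no id").
    List<Integer> shardsToContact(String docGroupId) {
        List<Integer> shards = new ArrayList<>();
        if (docGroupId != null) {
            shards.add(shardFor(docGroupId));
        } else {
            for (int shard = 0; shard < shardCount; shard++) {
                shards.add(shard);
            }
        }
        return shards;
    }
}
```

With an ID, new ShardRouter(8).shardsToContact("groupid12345") names a single shard; passing null returns all eight, which is the broadcast case described in the thread.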
