Re: ApacheCon 2015 at Austin, TX

2015-04-12 Thread Mike Drob
ApacheCon is starting tomorrow, so I'm seeing if pulling up this thread
yields any new replies this time. I'm hanging out in Austin, looking
forward to some good conversations and sessions!

On Wed, Feb 18, 2015 at 9:14 PM, CP Mishra mishr...@gmail.com wrote:

 Dmitry, that would be great.

 CP

 On Thu, Feb 12, 2015 at 5:35 AM, Dmitry Kan solrexp...@gmail.com wrote:

  Hi,
 
  Looks like I'll be there. So if you want to discuss luke / lucene / solr,
  will be happy to de-virtualize.
 
  Dmitry
 
  On Mon, Jan 12, 2015 at 6:32 PM, CP Mishra mishr...@gmail.com wrote:
 
   Hi,
  
   I am planning to attend ApacheCon 2015 at Austin, TX (Apr 13-16th) and
   wondering if there will be lucene/solr sessions in it.
  
   Anyone else planning to attend?
  
   Thanks,
   CP
  
 
 
 
  --
  Dmitry Kan
  Luke Toolbox: http://github.com/DmitryKey/luke
  Blog: http://dmitrykan.blogspot.com
  Twitter: http://twitter.com/dmitrykan
  SemanticAnalyzer: www.semanticanalyzer.info
 



Re: DocValues=true and indexed=false

2015-04-12 Thread david.w.smi...@gmail.com
Yes, surprisingly enough, if indexed=false, docValues=true — you can still
search.  I’ve seen the code behind it; it’s interesting.  Rob wrote it.
I’m not sure how scalable it is compared to the inverted index.  I suspect
it wouldn’t do well for a lot of distinct values but will be fine for a small
number of them.  What definition of “small” though… I don’t know.  I’d love
to see benchmarks of such a comparison.

~ David Smiley
Freelance Apache Lucene/Solr Search Consultant/Developer
http://www.linkedin.com/in/davidwsmiley

On Thu, Apr 9, 2015 at 8:30 PM, Erick Erickson erickerick...@gmail.com
wrote:

 So I was a bit embarrassed to be asked whether there was a use-case
 for this. I did some simple tests on a field (with a magnificent total
 of 32 docs indexed) and, with the exception of an error that
 facet.mincount=0 doesn't work if a field isn't indexed (see
 SOLR-5260), everything I tried worked fine. Searching by wildcards,
 searching with the term query parser, faceting, grouping, whatever. I
 admit I didn't spend very much time looking.

 As I understand it DocValues are basically a serialized UnInverted
 field so I'm wondering how searches work at all. Or, more
 specifically, whether this works fine on small numbers of docs but
 wouldn't scale. Or does a search on a DocValues field build an
 inverted field?

 Or anything else I should know. Is the rule simply 'if you search on
 it, or use it in fq clauses, set indexed=true, and if you facet,
 group etc. set docValues=true'? So it would make sense to set
 docValues=true and indexed=false on a field that's never searched
 or used in an fq clause but used for faceting  etc.

 So my mental model is some operations need inverted fields, and some
 need uninverted fields and that docValues provide a way to store
 uninverted fields on disk just like indexed=true allows you to store
 inverted fields on disk. Assuming you need both and set both
 indexed=true and docValues=true,  the _total_ memory requirements
 for Solr are the same. What's NOT the same is that the docValues make
 use of MMapDirectory where uninverting a field doesn't (this last is a
 total guess).

 I'm preparing a Google Doc that I'll certainly open up to anyone who
 wants to add to it. I'll then add the results into the Reference
 Guide.

 Anyway, you can see I'm confused, but if I ask enough silly questions
 eventually my questions get less silly.

 Erick
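
The mental model discussed in this thread (inverted fields for searching, uninverted fields for faceting and grouping) can be illustrated with a toy sketch. This is not how Lucene or Solr implements either structure; every name below (docs, build_inverted, build_column, search_inverted, search_column, facet_counts) is hypothetical and exists only for illustration. It shows why a search against a purely uninverted, per-document column is possible at all: the values can be checked one document at a time. Whether the real implementation works this way, and how it scales, is exactly the open question in the thread.

```python
from collections import defaultdict

# Toy data: doc id -> value of one single-valued string field (hypothetical).
docs = {0: "red", 1: "blue", 2: "red", 3: "green"}


def build_inverted(docs):
    """Inverted view: term -> list of doc ids containing that term."""
    index = defaultdict(list)
    for doc_id in sorted(docs):
        index[docs[doc_id]].append(doc_id)
    return dict(index)


def build_column(docs):
    """Uninverted (column) view: list position is the doc id, entry is its value."""
    return [docs[doc_id] for doc_id in range(len(docs))]


def search_inverted(index, term):
    # One dictionary lookup returns the matching doc ids directly.
    return index.get(term, [])


def search_column(column, term):
    # With only the column view, matching means checking every document's value.
    return [doc_id for doc_id, value in enumerate(column) if value == term]


def facet_counts(column):
    # Faceting reads each document's value once, which is what the column view is good at.
    counts = defaultdict(int)
    for value in column:
        counts[value] += 1
    return dict(counts)


inverted = build_inverted(docs)
column = build_column(docs)
print(search_inverted(inverted, "red"))
print(search_column(column, "red"))
print(facet_counts(column))
```

Both search functions return the same doc ids for the same term; the difference in this toy is only the amount of work done, a single lookup versus a pass over every document.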