Maybe it is because I am trying to crawl just a subfolder, mysite.com/subfolder. I am having problems configuring Nutch to do this, and it keeps crawling other pages from the parent directory.
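
A restriction like this is commonly expressed as accept/reject rules in Nutch's conf/regex-urlfilter.txt, where '+' accepts, '-' rejects, and the first matching rule decides. The sketch below is only an assumption-laden emulation of that ordering in Python, useful for checking a pattern before a crawl; the two rules and the helper name is_accepted are hypothetical, not taken from an actual configuration.

```python
import re

# Hypothetical rules written in the style of conf/regex-urlfilter.txt.
# '+' means accept, '-' means reject; rules are tried top to bottom.
RULES = [
    r"+^https?://(www\.)?mysite\.com/subfolder/",  # keep the subfolder
    r"-.",                                         # reject everything else
]


def is_accepted(url, rules=RULES):
    """Return True if the first rule whose regex is found in url is a '+' rule."""
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if re.search(pattern, url):
            return sign == "+"
    # Assumption: a URL that matches no rule is dropped.
    return False


if __name__ == "__main__":
    for url in (
        "http://www.mysite.com/subfolder/page.html",
        "http://www.mysite.com/es/biographies",
    ):
        print(url, is_accepted(url))
```

If the accept rule came after the catch-all "-." line, it would never be reached, which is one way pages outside the subfolder can be handled differently than expected.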
Thanks!

On Tue, Oct 4, 2016 at 4:00 AM, Markus Jelsma <[email protected]> wrote:

> Well, probably because you or something indexes different stuff to the
> Solr index. The first doesn't come from Nutch, the second does.
> Markus
>
> -----Original message-----
> > From:Nestor <[email protected]>
> > Sent: Tuesday 4th October 2016 2:07
> > To: [email protected]
> > Subject: why the results have diff number of fields
> >
> > In my solr query result for "url:*" number of returned fields vary compare
> > to my second query(see bottom)
> > <result name="response" numFound="4861" start="0">
> > <doc>
> > <str name="body">...</str>
> > <str name="changed">2010-10-13T18:58:28</str>
> > <str name="created">2010-10-13T18:58:28</str>
> > <str name="entity">file</str>
> > <str name="hash">hvvzxf</str>
> > <str name="id">hvvzxf/file/53-623</str>
> > <arr name="im_vid_9">...</arr>
> > <str name="language">und</str>
> > <str name="name"/>
> > <str name="nid">623</str>
> > <str name="path">sites/default/files/HomePage.pdf</str>
> > <str name="promote">F</str>
> > <str name="site">http://www.mysite.com/</str>
> > <str name="sm_facetbuilder_solr_type">solr_type:facet_3</str>
> > <arr name="sm_vid_Project_Type">...</arr>
> > <arr name="spell">...</arr>
> > <str name="ss_file_node_title">Training Test 2</str>
> > <str name="ss_file_node_url">http://www.mysite.com/training-test-2</str>
> > <str name="ss_filemime">application/pdf</str>
> > <str name="status">T</str>
> > <str name="sticky">F</str>
> > <str name="teaser">...</str>
> > <arr name="tid">...</arr>
> > <str name="timestamp">2012-11-28T05:05:52.623</str>
> > <str name="title">HomePage.pdf</str>
> > <str name="ts_vid_9_names">Construction Professional Services</str>
> > <str name="uid">0</str>
> > <str name="url">...</str>
> > <arr name="vid">...</arr>
> > </doc>
> >
> > When I do a solr query as "content:water" I get less fields in the results:
> > <result name="response" numFound="177" start="0">
> > <doc>
> > <float name="boost">0.027676692</float>
> > <str name="digest">4872e938706f9bee4d928330e5713623</str>
> > <str name="id">http://www.mysite.com/es/biographies</str>
> > <str name="segment">20161003150513</str>
> > <str name="title">Biographies</str>
> > <date name="tstamp">2016-10-03T15:21:45.346Z</date>
> > <str name="url">http://www.mysite.com/es/biographies</str>
> > </doc>
> >
> > Why is that?
> >
> > Thanks,
> >
> > --
> > View this message in context: http://lucene.472066.n3.nabble.com/why-the-results-have-diff-number-of-fields-tp4299378.html
> > Sent from the Nutch - User mailing list archive at Nabble.com.
> >

--
Né§t☼r
*Authority gone to one's head is the greatest enemy of Truth*

