Maybe it is because I am trying to crawl only a subfolder, mysite.com/subfolder,
and I am having problems configuring the crawl to do this, so it goes on to
crawl other pages from the parent directory.
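
A minimal sketch of how a subfolder-only URL filter is generally expected to
behave, assuming rules in the style of Nutch's conf/regex-urlfilter.txt, where
"+" accepts, "-" rejects, and the first matching rule decides. The rule list
and the accepts() helper are hypothetical and written in Python purely for
illustration; this is not Nutch's own code, and the real patterns depend on
the actual site URLs.

```python
import re

# Hypothetical rules modelled on the regex-urlfilter.txt convention:
# each entry is (sign, regex); '+' accepts, '-' rejects.
RULES = [
    ("+", r"^https?://(www\.)?mysite\.com/subfolder/"),  # keep the subfolder
    ("-", r"."),                                          # reject everything else
]


def accepts(url, rules=RULES):
    """Return True if the first rule matching `url` is an accept rule."""
    for sign, pattern in rules:
        if re.search(pattern, url):
            return sign == "+"
    # Assumption: a URL that matches no rule at all is ignored.
    return False


if __name__ == "__main__":
    for candidate in (
        "http://www.mysite.com/subfolder/page.html",
        "http://www.mysite.com/es/biographies",
    ):
        print(candidate, accepts(candidate))
```

Rule order matters in this sketch: if the catch-all reject came first, nothing
would be accepted, and if it were missing, the outcome for parent-directory
pages would depend on whatever other accept rules are present.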

Thanks!



On Tue, Oct 4, 2016 at 4:00 AM, Markus Jelsma <[email protected]>
wrote:

> Well, probably because you or something indexes different stuff to the
> Solr index. The first doesn't come from Nutch, the second does.
> Markus
>
>
>
> -----Original message-----
> > From:Nestor <[email protected]>
> > Sent: Tuesday 4th October 2016 2:07
> > To: [email protected]
> > Subject: why the results have diff number of fields
> >
> > In my Solr query result for "url:*", the number of returned fields
> > differs from that of my second query (see bottom):
> > <result name="response" numFound="4861" start="0">
> > <doc>
> > <str name="body">...</str>
> > <str name="changed">2010-10-13T18:58:28</str>
> > <str name="created">2010-10-13T18:58:28</str>
> > <str name="entity">file</str>
> > <str name="hash">hvvzxf</str>
> > <str name="id">hvvzxf/file/53-623</str>
> > <arr name="im_vid_9">...</arr>
> > <str name="language">und</str>
> > <str name="name"/>
> > <str name="nid">623</str>
> > <str name="path">sites/default/files/HomePage.pdf</str>
> > <str name="promote">F</str>
> > <str name="site">http://www.mysite.com/</str>
> > <str name="sm_facetbuilder_solr_type">solr_type:facet_3</str>
> > <arr name="sm_vid_Project_Type">...</arr>
> > <arr name="spell">...</arr>
> > <str name="ss_file_node_title">Training Test 2</str>
> > <str name="ss_file_node_url">http://www.mysite.com/training-test-2</str>
> > <str name="ss_filemime">application/pdf</str>
> > <str name="status">T</str>
> > <str name="sticky">F</str>
> > <str name="teaser">...</str>
> > <arr name="tid">...</arr>
> > <str name="timestamp">2012-11-28T05:05:52.623</str>
> > <str name="title">HomePage.pdf</str>
> > <str name="ts_vid_9_names">Construction Professional Services</str>
> > <str name="uid">0</str>
> > <str name="url">...</str>
> > <arr name="vid">...</arr>
> > </doc>
> >
> > When I run the Solr query "content:water", I get fewer fields in the
> > results:
> > <result name="response" numFound="177" start="0">
> > <doc>
> > <float name="boost">0.027676692</float>
> > <str name="digest">4872e938706f9bee4d928330e5713623</str>
> > <str name="id">http://www.mysite.com/es/biographies</str>
> > <str name="segment">20161003150513</str>
> > <str name="title">Biographies</str>
> > <date name="tstamp">2016-10-03T15:21:45.346Z</date>
> > <str name="url">http://www.mysite.com/es/biographies</str>
> > </doc>
> >
> > Why is that?
> >
> >
> > Thanks,
> >
> >
> >
> > --
> > View this message in context: http://lucene.472066.n3.nabble.com/why-the-results-have-diff-number-of-fields-tp4299378.html
> > Sent from the Nutch - User mailing list archive at Nabble.com.
> >
>



-- 
Né§t☼r  *Authority gone to one's head is the greatest enemy of Truth*
