Hi Sherban,

On Wed, Sep 30, 2015 at 6:46 AM, <[email protected]> wrote:

>
> I tried with SOLR 4.9.1.
>

OK. As I said, Solr 4.6 is supported, but never mind.


>
> I copied /release-2.3.1/runtime/local/conf/schema.xml to
> solr-4.9.1/example/solr/collection1/conf/schema.xml
>

Good.


>
> Result of /release-2.3.1/runtime/local/bin/crawl urls method_centers
> http://localhost:8983/solr 2
>
>
> InjectorJob: total number of urls rejected by filters: 1
>

Notice that your regex urlfilter is rejecting one of your seed URLs.


> InjectorJob: total number of urls injected after normalization and
> filtering: 5
>

[...snip]

> GeneratorJob: generated batch id: 1443556518-1067112789 containing 0 URLs
> Generate returned 1 (no new segments created)
> Escaping loop: no more URLs to fetch now
>
> There are 6 URLs in my urls/seeds.txt file. Why does it say 0 URLs?
>

One was rejected, as explained above. Additionally, it looks like there is
also an error fetching your seeds and parsing out hyperlinks. I would
encourage you to recheck the early stages of configuring and preparing your
crawler. Some configuration is incorrect, possibly more problems with
your regex urlfilters.
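
If it helps to see which seed the filter drops, here is a rough Python
sketch of how I understand the regex urlfilter rules to be applied: each
rule is "+" or "-" followed by a regex, the first rule that matches
decides, and a URL that matches no rule is rejected. This is not Nutch's
own code. The rules and seed URLs below are hypothetical placeholders, and
Python's regex syntax differs from Java's in places, so treat the result
as a hint only and paste in your own rules and seeds.

```python
import re


def load_rules(text):
    """Parse '+regex' / '-regex' lines; skip blank lines and '#' comments."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError(f"rule must start with + or -: {line!r}")
        rules.append((sign == "+", re.compile(pattern)))
    return rules


def accepts(rules, url):
    """First matching rule decides; a URL matching no rule is rejected."""
    for accept, pattern in rules:
        if pattern.search(url):
            return accept
    return False


# Hypothetical rules and seeds, for illustration only.
SAMPLE_RULES = """
# skip URLs containing query-like characters
-[?*!@=]
# accept only this host
+^https?://example\\.org/
"""
SAMPLE_SEEDS = [
    "http://example.org/",
    "http://example.org/page?id=1",
    "http://other.example.com/",
]

if __name__ == "__main__":
    rules = load_rules(SAMPLE_RULES)
    for url in SAMPLE_SEEDS:
        print("accepted" if accepts(rules, url) else "REJECTED", url)
```

Any seed reported as REJECTED here is a candidate for the one the
InjectorJob filtered out.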


>
>
> The index job worked but there’s no data in SOLR. Is there a known good
> version of SOLR that works with 2.3.1 schema.xml? Are the tutorial
> instructions still valid?
>

No, it did not. It failed. Look at the hadoop.log.
Also, please look at your solr.log; it will give you better insight
into what is wrong with your Solr server and which messages are failing.
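
If the logs are long, a small script can pull out the interesting lines.
The Python sketch below assumes each log line carries its level (ERROR,
FATAL or WARN) as a standalone word, which is an assumption about your
logging configuration. It prints every flagged line plus a couple of
following lines so the start of a stack trace is visible. Pass the paths
to your hadoop.log and solr.log as arguments.

```python
import re
import sys

# Assumption: the log level appears as a standalone word on the line.
LEVEL = re.compile(r"\b(ERROR|FATAL|WARN)\b")


def problem_lines(lines, context=2):
    """Yield (line_number, text) for flagged lines plus `context` following lines."""
    remaining = 0
    for number, line in enumerate(lines, start=1):
        if LEVEL.search(line):
            remaining = context + 1
        if remaining > 0:
            yield number, line.rstrip("\n")
            remaining -= 1


if __name__ == "__main__":
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as handle:
            for number, text in problem_lines(handle):
                print(f"{path}:{number}: {text}")
```
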
Thanks
