Hi Kamil,

If no attempts are being made to actually index documents, then no documents will be indexed.

(1) What repository connection is this? Can you try something simple first, like indexing from the file system?

(2) I have confirmed that changing the collection does NOT trigger reindexing of documents. That is a bug, but you can work around it by clicking the "Reindex all documents" button on the output connection's view page after every change to the collection name. Did you click that button?

Karl

On Wed, Apr 1, 2015 at 11:50 AM, Kamil Żyta <[email protected]> wrote:
> I see only start/access/stop activities. Access denied is normal in my setup.
> So how can I debug the problem?
>
> K
>
> On Wed, Apr 01, 2015 at 08:32:42AM -0700, Karl Wright wrote:
> > Hi Kamil,
> > Can you look at the simple history report, to verify whether manifoldcf
> > is even attempting to post documents? It is possible that the solr
> > connector doesn't count a change in collection name as requiring a
> > reindex.
> >
> > Karl
> >
> > Sent from my Windows Phone
> > From: Kamil Żyta
> > Sent: 4/1/2015 11:08 AM
> > To: [email protected]
> > Subject: Re: MCF 2 and Solr Cloud 5
> > I created new collection in solr, configure mcf for this collection:
> > 'Connection working' but I cannot see any /update request from mcf in
> > solr, only:
> >
> > INFO - 2015-04-01 15:03:16.442; org.apache.solr.update.DirectUpdateHandler2; start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> > INFO - 2015-04-01 15:03:16.444; org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes. Skipping IW.commit.
> > INFO - 2015-04-01 15:03:16.445; org.apache.solr.core.SolrCore; SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
> > INFO - 2015-04-01 15:03:16.445; org.apache.solr.update.DirectUpdateHandler2; end_commit_flush
> > INFO - 2015-04-01 15:03:16.445; org.apache.solr.update.processor.LogUpdateProcessor; [dysk_shard1_replica1] webapp=/solr path=/update params={update.distrib=FROMLEADER&update.chain=add-unknown-fields-to-the-schema&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=http://10.26.26.29:8983/solr/dysk_shard2_replica1/&commit_end_point=true&wt=javabin&version=2&expungeDeletes=false} {commit=} 0 3
> > INFO - 2015-04-01 15:03:16.448; org.apache.solr.update.DirectUpdateHandler2; start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> > INFO - 2015-04-01 15:03:16.449; org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes. Skipping IW.commit.
> > INFO - 2015-04-01 15:03:16.449; org.apache.solr.core.SolrCore; SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
> > INFO - 2015-04-01 15:03:16.450; org.apache.solr.update.DirectUpdateHandler2; end_commit_flush
> > INFO - 2015-04-01 15:03:16.450; org.apache.solr.update.processor.LogUpdateProcessor; [dysk_shard2_replica1] webapp=/solr path=/update params={update.distrib=FROMLEADER&update.chain=add-unknown-fields-to-the-schema&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=http://10.26.26.29:8983/solr/dysk_shard2_replica1/&commit_end_point=true&wt=javabin&version=2&expungeDeletes=false} {commit=} 0 2
> > INFO - 2015-04-01 15:03:16.456; org.apache.solr.update.processor.LogUpdateProcessor; [dysk_shard2_replica1] webapp=/solr path=/update/extract params={commit=true&wt=javabin&version=2} {commit=} 0 21
> >
> > K
> >
> > On Wed, Apr 01, 2015 at 10:53:39AM -0400, Karl Wright wrote:
> > > "When I put 'esci' as collection name I get a error.
> > > When I put 'collection1' I get 'Connection working' and no errors in logs but
> > > still no docs in solr."
> > >
> > > Hi Kamil,
> > > Do you get the exception when you use "collection1" as the collection
> > > name? If not, then here's what I recommend:
> > >
> > > (1) Look at the Solr logs. There should be an INFO message for each
> > > document posted. There is a URL in the message, and a document length, and
> > > a result. It would be great if you could include a couple of these for us
> > > to look at.
> > >
> > > (2) If there are any exceptions etc. in the Solr logs, please send those
> > > along as well.
> > >
> > > Offhand, this sounds like documents get posted properly but then ignored by
> > > Solr. There are a lot of potential reasons why that could be the case.
> > > But if the documents are getting ignored, or if Tika is not successfully
> > > extracting data, then we should be able to figure out why based on the Solr
> > > logs.
> > >
> > > Thanks,
> > > Karl
> > >
> > > On Wed, Apr 1, 2015 at 10:39 AM, Kamil Żyta <[email protected]> wrote:
> > >
> > > > Ok, see my first mail. When I put 'esci' as collection name I get a error.
> > > > When I put 'collection1' I get 'Connection working' and no errors in logs but
> > > > still no docs in solr.
> > > >
> > > > K
> > > >
> > > > On Wed, Apr 01, 2015 at 10:27:50AM -0400, Karl Wright wrote:
> > > > > Hi Kamil,
> > > > >
> > > > > This is happening on the commit. It looks to me like it's because you are
> > > > > specifying a collection that doesn't actually exist:
> > > > >
> > > > > >>>>>>
> > > > > DocCollection col = getDocCollection(clusterState, collection);
> > > > >
> > > > > DocRouter router = col.getRouter();
> > > > > <<<<<<
> > > > >
> > > > > It's complaining because "col" is coming back null.
> > > > >
> > > > > Karl
> > > > >
> > > > > On Wed, Apr 1, 2015 at 10:19 AM, Kamil Żyta <[email protected]> wrote:
> > > > >
> > > > > > ERROR 2015-04-01 16:09:24,032 (Job notification thread) - Unhandled SolrServerException: java.lang.NullPointerException
> > > > > > org.apache.manifoldcf.core.interfaces.ManifoldCFException: Unhandled SolrServerException: java.lang.NullPointerException
> > > > > >     at org.apache.manifoldcf.agents.output.solr.HttpPoster.handleSolrServerException(HttpPoster.java:364)
> > > > > >     at org.apache.manifoldcf.agents.output.solr.HttpPoster.commitPost(HttpPoster.java:308)
> > > > > >     at org.apache.manifoldcf.agents.output.solr.SolrConnector.noteJobComplete(SolrConnector.java:610)
> > > > > >     at org.apache.manifoldcf.crawler.system.JobNotificationThread.run(JobNotificationThread.java:121)
> > > > > > Caused by: org.apache.solr.client.solrj.SolrServerException: java.lang.NullPointerException
> > > > > >     at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:873)
> > > > > >     at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:738)
> > > > > >     at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
> > > > > >     at org.apache.manifoldcf.agents.output.solr.HttpPoster$CommitThread.run(HttpPoster.java:1372)
> > > > > > Caused by: java.lang.NullPointerException
> > > > > >     at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:520)
> > > > > >     at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:892)
> > > > > >     at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:795)
> > > > > >     ... 3 more
> > > > > >
> > > > > > K
> > > > > >
> > > > > > On Wed, Apr 01, 2015 at 10:15:13AM -0400, Karl Wright wrote:
> > > > > > > Hi Kamil,
> > > > > > >
> > > > > > > So you are still seeing a NullPointerException from
> > > > > > > org.apache.solr.client.solrj.impl.CloudSolrClient? Can you provide the
> > > > > > > entire stack trace?
> > > > > > >
> > > > > > > Karl
> > > > > > >
> > > > > > > On Wed, Apr 1, 2015 at 10:10 AM, Kamil Żyta <[email protected]> wrote:
> > > > > > >
> > > > > > > > Hi Karl,
> > > > > > > > same thing with trunk. Any advice?
> > > > > > > >
> > > > > > > > K
> > > > > > > >
> > > > > > > > On Wed, Apr 01, 2015 at 09:37:47AM -0400, Karl Wright wrote:
> > > > > > > > > Hi Kamil,
> > > > > > > > >
> > > > > > > > > Solrj 5.0 changed massively from Solrj 4.x. The work to use Solrj 5.0 has
> > > > > > > > > been done on trunk. You will need to check out and build trunk in order to
> > > > > > > > > use Solr 5.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Karl
> > > > > > > > >
> > > > > > > > > On Wed, Apr 1, 2015 at 9:23 AM, Kamil Żyta <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > Hi,
> > > > > > > > > > I set up solr 5 (Cloud) and mcf2, created core in solr with 2 shards and 2
> > > > > > > > > > replicas:
> > > > > > > > > > https://i.imgur.com/M05QTu7.png and created Output Connections in mcf.
> > > > > > > > > > When I put 'esci' in 'Collection name' I got error:
> > > > > > > > > > Threw exception: 'Unhandled SolrServerException: No live SolrServers
> > > > > > > > > > available to handle this request:[http://10.26.26.29:8983/solr/esci,
> > > > > > > > > > http://10.26.26.28:8983/solr/esci]'
> > > > > > > > > > When I leave 'Collection name' empty I have 'Connection working'.
> > > > > > > > > > Now when I start job, everything look good, worker fetch docs, etc
> > > > > > > > > > but I cannot see any docs in solr. Nothing in logs except one line in worker
> > > > > > > > > > console:
> > > > > > > > > > [Thread-6476596] ERROR org.apache.solr.client.solrj.impl.CloudSolrClient - Request to collection failed due to (0) java.lang.NullPointerException, retry? 0
> > > > > > > > > > thanks for the advice.
> > > > > > > > > >
> > > > > > > > > > K
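
A note on reading the Solr side of this: every LogUpdateProcessor entry quoted in the thread ends in {commit=}, which fits the diagnosis that ManifoldCF was committing but never posting documents. For a longer log, a rough Python sketch like the one below can count this. The regular expression is modeled only on the log lines quoted in this thread, the function and variable names are made up for illustration, and the "add=" test is an assumption about how posted documents show up in the operations field, so check it against your own log before trusting the counts.

```python
import re

# Layout assumed from the LogUpdateProcessor lines quoted in this thread:
#   ... [core] webapp=/solr path=/update... params={...} {ops} status qtime
# Other Solr versions or logging configurations may format this differently.
UPDATE_LINE = re.compile(
    r"\[(?P<core>[^\]]+)\]\s+webapp=\S+\s+path=(?P<path>/update\S*)\s+"
    r"params=\{(?P<params>.*?)\}\s+\{(?P<ops>.*)\}\s+(?P<status>\d+)\s+(?P<qtime>\d+)\s*$"
)


def summarize(log_lines):
    """Count update-handler log entries by the kind of operation they record."""
    counts = {"add": 0, "commit_only": 0, "other": 0}
    for line in log_lines:
        match = UPDATE_LINE.search(line)
        if match is None:
            continue  # not an update-handler request line
        ops = match.group("ops")
        if "add=" in ops:
            # Assumption: posted documents are logged as {add=[...]}.
            counts["add"] += 1
        elif ops.strip() == "commit=":
            counts["commit_only"] += 1
        else:
            counts["other"] += 1
    return counts


# One of the lines Kamil posted, copied from the thread.
SAMPLE = [
    "INFO - 2015-04-01 15:03:16.456; "
    "org.apache.solr.update.processor.LogUpdateProcessor; "
    "[dysk_shard2_replica1] webapp=/solr path=/update/extract "
    "params={commit=true&wt=javabin&version=2} {commit=} 0 21",
]

print(summarize(SAMPLE))
```

To run it over a real log, pass the open file object to summarize(). If the "add" count stays at zero after a job run, the documents are not reaching Solr, and the next place to look is the ManifoldCF simple history report.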
