On Wed, Apr 01, 2015 at 12:07:47PM -0400, Karl Wright wrote:
> Hi Kamil,
>
> If no attempts are being made to actually index documents, then no
> documents will be indexed.
>
> (1) What repository connection is this? Can you try something simple
> first, like indexing from the file system?

I use CIFS. In 'Status and Job Management', Documents/Processed is 2598, so I think it can reach the files, but I can try with the 'File systems' connector.

> (2) I have confirmed that changing the collection does NOT trigger
> reindexing of documents. That is a bug, but you can work around it by
> clicking the "Reindex all documents" button on the output connection's view
> page after every change to the collection name. Did you click that button?

Yes, I clicked that button many times.

K

>
> On Wed, Apr 1, 2015 at 11:50 AM, Kamil Żyta <[email protected]> wrote:
>
> > I see only start/access/stop activities. Access denied is normal in my
> > setup.
> > So how can I debug the problem?
> >
> > K
> >
> > On Wed, Apr 01, 2015 at 08:32:42AM -0700, Karl Wright wrote:
> > > Hi Kamil,
> > > Can you look at the simple history report, to verify whether manifoldcf
> > > is even attempting to post documents? It is possible that the solr
> > > connector doesn't count a change in collection name as requiring a
> > > reindex.
> > >
> > > Karl
> > >
> > > Sent from my Windows Phone
> > > From: Kamil Żyta
> > > Sent: 4/1/2015 11:08 AM
> > > To: [email protected]
> > > Subject: Re: MCF 2 and Solr Cloud 5
> > > I created new collection in solr, configure mcf for this collection:
> > > 'Connection working' but I cannot see any /update request from mcf in
> > > solr, only:
> > >
> > > INFO - 2015-04-01 15:03:16.442; org.apache.solr.update.DirectUpdateHandler2; start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> > > INFO - 2015-04-01 15:03:16.444; org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes. Skipping IW.commit.
> > > INFO - 2015-04-01 15:03:16.445; org.apache.solr.core.SolrCore; SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
> > > INFO - 2015-04-01 15:03:16.445; org.apache.solr.update.DirectUpdateHandler2; end_commit_flush
> > > INFO - 2015-04-01 15:03:16.445; org.apache.solr.update.processor.LogUpdateProcessor; [dysk_shard1_replica1] webapp=/solr path=/update params={update.distrib=FROMLEADER&update.chain=add-unknown-fields-to-the-schema&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=http://10.26.26.29:8983/solr/dysk_shard2_replica1/&commit_end_point=true&wt=javabin&version=2&expungeDeletes=false} {commit=} 0 3
> > > INFO - 2015-04-01 15:03:16.448; org.apache.solr.update.DirectUpdateHandler2; start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> > > INFO - 2015-04-01 15:03:16.449; org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes. Skipping IW.commit.
> > > INFO - 2015-04-01 15:03:16.449; org.apache.solr.core.SolrCore; SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
> > > INFO - 2015-04-01 15:03:16.450; org.apache.solr.update.DirectUpdateHandler2; end_commit_flush
> > > INFO - 2015-04-01 15:03:16.450; org.apache.solr.update.processor.LogUpdateProcessor; [dysk_shard2_replica1] webapp=/solr path=/update params={update.distrib=FROMLEADER&update.chain=add-unknown-fields-to-the-schema&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=http://10.26.26.29:8983/solr/dysk_shard2_replica1/&commit_end_point=true&wt=javabin&version=2&expungeDeletes=false} {commit=} 0 2
> > > INFO - 2015-04-01 15:03:16.456; org.apache.solr.update.processor.LogUpdateProcessor; [dysk_shard2_replica1] webapp=/solr path=/update/extract params={commit=true&wt=javabin&version=2} {commit=} 0 21
> > >
> > > K
> > >
> > > On Wed, Apr 01, 2015 at 10:53:39AM -0400, Karl Wright wrote:
> > > > "When I put 'esci' as collection name I get a error.
> > > > When I put 'collection1' I get 'Connection working' and no errors in logs but
> > > > still no docs in solr."
> > > >
> > > > Hi Kamil,
> > > > Do you get the exception when you use "collection1" as the collection
> > > > name? If not, then here's what I recommend:
> > > >
> > > > (1) Look at the Solr logs. There should be an INFO message for each
> > > > document posted. There is a URL in the message, and a document length, and
> > > > a result. It would be great if you could include a couple of these for us
> > > > to look at.
> > > >
> > > > (2) If there are any exceptions etc. in the Solr logs, please send those
> > > > along as well.
> > > >
> > > > Offhand, this sounds like documents get posted properly but then ignored by
> > > > Solr. There are a lot of potential reasons why that could be the case.
> > > > But if the documents are getting ignored, or if Tika is not successfully
> > > > extracting data, then we should be able to figure out why based on the Solr
> > > > logs.
> > > >
> > > > Thanks,
> > > > Karl
> > > >
> > > > On Wed, Apr 1, 2015 at 10:39 AM, Kamil Żyta <[email protected]> wrote:
> > > >
> > > > > Ok, see my first mail. When I put 'esci' as collection name I get a error.
> > > > > When I put 'collection1' I get 'Connection working' and no errors in logs but
> > > > > still no docs in solr.
> > > > >
> > > > > K
> > > > >
> > > > > On Wed, Apr 01, 2015 at 10:27:50AM -0400, Karl Wright wrote:
> > > > > > Hi Kamil,
> > > > > >
> > > > > > This is happening on the commit. It looks to me like it's because you are
> > > > > > specifying a collection that doesn't actually exist:
> > > > > >
> > > > > > >>>>>>
> > > > > > DocCollection col = getDocCollection(clusterState, collection);
> > > > > >
> > > > > > DocRouter router = col.getRouter();
> > > > > > <<<<<<
> > > > > >
> > > > > > It's complaining because "col" is coming back null.
> > > > > >
> > > > > > Karl
> > > > > >
> > > > > > On Wed, Apr 1, 2015 at 10:19 AM, Kamil Żyta <[email protected]> wrote:
> > > > > >
> > > > > > > ERROR 2015-04-01 16:09:24,032 (Job notification thread) - Unhandled SolrServerException: java.lang.NullPointerException
> > > > > > > org.apache.manifoldcf.core.interfaces.ManifoldCFException: Unhandled SolrServerException: java.lang.NullPointerException
> > > > > > >     at org.apache.manifoldcf.agents.output.solr.HttpPoster.handleSolrServerException(HttpPoster.java:364)
> > > > > > >     at org.apache.manifoldcf.agents.output.solr.HttpPoster.commitPost(HttpPoster.java:308)
> > > > > > >     at org.apache.manifoldcf.agents.output.solr.SolrConnector.noteJobComplete(SolrConnector.java:610)
> > > > > > >     at org.apache.manifoldcf.crawler.system.JobNotificationThread.run(JobNotificationThread.java:121)
> > > > > > > Caused by: org.apache.solr.client.solrj.SolrServerException: java.lang.NullPointerException
> > > > > > >     at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:873)
> > > > > > >     at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:738)
> > > > > > >     at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
> > > > > > >     at org.apache.manifoldcf.agents.output.solr.HttpPoster$CommitThread.run(HttpPoster.java:1372)
> > > > > > > Caused by: java.lang.NullPointerException
> > > > > > >     at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:520)
> > > > > > >     at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:892)
> > > > > > >     at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:795)
> > > > > > >     ... 3 more
> > > > > > >
> > > > > > > K
> > > > > > >
> > > > > > > On Wed, Apr 01, 2015 at 10:15:13AM -0400, Karl Wright wrote:
> > > > > > > > Hi Kamil,
> > > > > > > >
> > > > > > > > So you are still seeing a NullPointerException from
> > > > > > > > org.apache.solr.client.solrj.impl.CloudSolrClient? Can you provide the
> > > > > > > > entire stack trace?
> > > > > > > >
> > > > > > > > Karl
> > > > > > > >
> > > > > > > > On Wed, Apr 1, 2015 at 10:10 AM, Kamil Żyta <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > Hi Karl,
> > > > > > > > > same thing with trunk. Any advice?
> > > > > > > > >
> > > > > > > > > K
> > > > > > > > >
> > > > > > > > > On Wed, Apr 01, 2015 at 09:37:47AM -0400, Karl Wright wrote:
> > > > > > > > > > Hi Kamil,
> > > > > > > > > >
> > > > > > > > > > Solrj 5.0 changed massively from Solrj 4.x. The work to use Solrj 5.0 has
> > > > > > > > > > been done on trunk. You will need to check out and build trunk in order to
> > > > > > > > > > use Solr 5.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Karl
> > > > > > > > > >
> > > > > > > > > > On Wed, Apr 1, 2015 at 9:23 AM, Kamil Żyta <[email protected]> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi,
> > > > > > > > > > > I set up solr 5 (Cloud) and mcf2, created core in solr with 2 shards and 2
> > > > > > > > > > > replicas:
> > > > > > > > > > > https://i.imgur.com/M05QTu7.png and created Output Connections in mcf.
> > > > > > > > > > > When I put 'esci' in 'Collection name' I got error:
> > > > > > > > > > > Threw exception: 'Unhandled SolrServerException: No live SolrServers
> > > > > > > > > > > available to handle this request:[http://10.26.26.29:8983/solr/esci,
> > > > > > > > > > > http://10.26.26.28:8983/solr/esci]'
> > > > > > > > > > > When I leave 'Collection name' empty I have 'Connection working'.
> > > > > > > > > > > Now when I start job, everything look good, worker fetch docs, etc
> > > > > > > > > > > but I cannot see any docs in solr. Nothing in logs except one line in
> > > > > > > > > > > worker console:
> > > > > > > > > > > [Thread-6476596] ERROR org.apache.solr.client.solrj.impl.CloudSolrClient - Request to collection failed due to (0) java.lang.NullPointerException, retry? 0
> > > > > > > > > > > thanks for the advice.
> > > > > > > > > > >
> > > > > > > > > > > K
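
For anyone repeating the log check discussed in this thread (is Solr receiving anything other than commits?), the following is a minimal sketch, not part of ManifoldCF or Solr. The class and method names are hypothetical. It assumes the log format matches the excerpts quoted above: update requests are logged by LogUpdateProcessor with a `path=/update...` field, and a commit-only request carries the payload `{commit=}`. Any LogUpdateProcessor update line without that payload is treated as a candidate document post, which is only a heuristic.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: classifies Solr log lines like the ones quoted in this thread.
public class SolrUpdateLogCheck {
    // Captures the request path of an update request, e.g. "/update" or "/update/extract".
    private static final Pattern PATH = Pattern.compile("path=(/update\\S*)");

    /** Returns the update path logged on this line, or null if it is not a LogUpdateProcessor update line. */
    public static String updatePath(String line) {
        if (!line.contains("LogUpdateProcessor")) {
            return null;
        }
        Matcher m = PATH.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    /** True when the line is an update request whose logged payload is a bare commit (assumed marker: "{commit=}"). */
    public static boolean isCommitOnly(String line) {
        return updatePath(line) != null && line.contains("{commit=}");
    }

    /** Counts update requests that logged something other than a bare commit. */
    public static long countNonCommitUpdates(List<String> lines) {
        return lines.stream()
                .filter(l -> updatePath(l) != null)
                .filter(l -> !isCommitOnly(l))
                .count();
    }

    public static void main(String[] args) {
        // Two lines copied from the log excerpt quoted in the thread.
        List<String> lines = List.of(
            "INFO - 2015-04-01 15:03:16.444; org.apache.solr.update.DirectUpdateHandler2; "
                + "No uncommitted changes. Skipping IW.commit.",
            "INFO - 2015-04-01 15:03:16.456; org.apache.solr.update.processor.LogUpdateProcessor; "
                + "[dysk_shard2_replica1] webapp=/solr path=/update/extract "
                + "params={commit=true&wt=javabin&version=2} {commit=} 0 21");
        System.out.println("non-commit update requests: " + countNonCommitUpdates(lines));
    }
}
```

A count of zero over a whole crawl would be consistent with Karl's reading that ManifoldCF is not attempting to post documents at all, rather than Solr discarding them.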
