Any luck figuring this out?

Karl

On Wed, Apr 1, 2015 at 1:01 PM, Karl Wright <[email protected]> wrote:
> The button works fine. So the problem must be on the repository side.
>
> Karl
>
>
> On Wed, Apr 1, 2015 at 12:56 PM, Karl Wright <[email protected]> wrote:
>
>> If your simple history shows no documents being processed or indexed,
>> then that's the problem, or at least one of them.
>>
>> I will try to confirm that the reindex button still works as it should.
>>
>> Karl
>>
>>
>> On Wed, Apr 1, 2015 at 12:43 PM, Kamil Żyta <[email protected]> wrote:
>>
>>> On Wed, Apr 01, 2015 at 12:07:47PM -0400, Karl Wright wrote:
>>> > Hi Kamil,
>>> >
>>> > If no attempts are being made to actually index documents, then no
>>> > documents will be indexed.
>>> >
>>> > (1) What repository connection is this? Can you try something simple
>>> > first, like indexing from the file system?
>>>
>>> I use cifs, in 'Status and Job Management' Documents/Processed is 2598
>>> so I think he can reach files but I can try with 'File systems'
>>> connector.
>>>
>>> > (2) I have confirmed that changing the collection does NOT trigger
>>> > reindexing of documents. That is a bug, but you can work around it by
>>> > clicking the "Reindex all documents" button on the output connection's view
>>> > page after every change to the collection name. Did you click that button?
>>>
>>> yes, I clicked that button many times.
>>>
>>> K
>>>
>>> >
>>> >
>>> > On Wed, Apr 1, 2015 at 11:50 AM, Kamil Żyta <[email protected]> wrote:
>>> >
>>> > > I see only start/access/stop activities. Access denied is normal in my
>>> > > setup.
>>> > > So how can I debug the problem?
>>> > >
>>> > > K
>>> > >
>>> > > On Wed, Apr 01, 2015 at 08:32:42AM -0700, Karl Wright wrote:
>>> > > > Hi Kamil,
>>> > > > Can you look at the simple history report, to verify whether manifoldcf
>>> > > > is even attempting to post documents? It is possible that the solr
>>> > > > connector doesn't count a change in collection name as requiring a
>>> > > > reindex.
>>> > > >
>>> > > > Karl
>>> > > >
>>> > > > Sent from my Windows Phone
>>> > > > From: Kamil Żyta
>>> > > > Sent: 4/1/2015 11:08 AM
>>> > > > To: [email protected]
>>> > > > Subject: Re: MCF 2 and Solr Cloud 5
>>> > > > I created new collection in solr, configure mcf for this collection:
>>> > > > 'Connection working' but I cannot see any /update request from mcf in
>>> > > > solr, only:
>>> > > >
>>> > > > INFO - 2015-04-01 15:03:16.442; org.apache.solr.update.DirectUpdateHandler2; start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
>>> > > > INFO - 2015-04-01 15:03:16.444; org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes. Skipping IW.commit.
>>> > > > INFO - 2015-04-01 15:03:16.445; org.apache.solr.core.SolrCore; SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
>>> > > > INFO - 2015-04-01 15:03:16.445; org.apache.solr.update.DirectUpdateHandler2; end_commit_flush
>>> > > > INFO - 2015-04-01 15:03:16.445; org.apache.solr.update.processor.LogUpdateProcessor; [dysk_shard1_replica1] webapp=/solr path=/update params={update.distrib=FROMLEADER&update.chain=add-unknown-fields-to-the-schema&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=http://10.26.26.29:8983/solr/dysk_shard2_replica1/&commit_end_point=true&wt=javabin&version=2&expungeDeletes=false} {commit=} 0 3
>>> > > > INFO - 2015-04-01 15:03:16.448; org.apache.solr.update.DirectUpdateHandler2; start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
>>> > > > INFO - 2015-04-01 15:03:16.449; org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes. Skipping IW.commit.
>>> > > > INFO - 2015-04-01 15:03:16.449; org.apache.solr.core.SolrCore; SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
>>> > > > INFO - 2015-04-01 15:03:16.450; org.apache.solr.update.DirectUpdateHandler2; end_commit_flush
>>> > > > INFO - 2015-04-01 15:03:16.450; org.apache.solr.update.processor.LogUpdateProcessor; [dysk_shard2_replica1] webapp=/solr path=/update params={update.distrib=FROMLEADER&update.chain=add-unknown-fields-to-the-schema&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=http://10.26.26.29:8983/solr/dysk_shard2_replica1/&commit_end_point=true&wt=javabin&version=2&expungeDeletes=false} {commit=} 0 2
>>> > > > INFO - 2015-04-01 15:03:16.456; org.apache.solr.update.processor.LogUpdateProcessor; [dysk_shard2_replica1] webapp=/solr path=/update/extract params={commit=true&wt=javabin&version=2} {commit=} 0 21
>>> > > >
>>> > > > K
>>> > > >
>>> > > > On Wed, Apr 01, 2015 at 10:53:39AM -0400, Karl Wright wrote:
>>> > > > > "When I put 'esci' as collection name I get a error.
>>> > > > > When I put 'collection1' I get 'Connection working' and no errors in logs
>>> > > > > but
>>> > > > > still no docs in solr."
>>> > > > >
>>> > > > > Hi Kamil,
>>> > > > > Do you get the exception when you use "collection1" as the collection
>>> > > > > name? If not, then here's what I recommend:
>>> > > > >
>>> > > > > (1) Look at the Solr logs. There should be an INFO message for each
>>> > > > > document posted. There is a URL in the message, and a document length, and
>>> > > > > a result. It would be great if you could include a couple of these for us
>>> > > > > to look at.
>>> > > > >
>>> > > > > (2) If there are any exceptions etc. in the Solr logs, please send those
>>> > > > > along as well.
>>> > > > >
>>> > > > > Offhand, this sounds like documents get posted properly but then ignored by
>>> > > > > Solr. There are a lot of potential reasons why that could be the case.
>>> > > > > But if the documents are getting ignored, or if Tika is not successfully
>>> > > > > extracting data, then we should be able to figure out why based on the Solr
>>> > > > > logs.
>>> > > > >
>>> > > > > Thanks,
>>> > > > > Karl
>>> > > > >
>>> > > > >
>>> > > > >
>>> > > > > On Wed, Apr 1, 2015 at 10:39 AM, Kamil Żyta <[email protected]> wrote:
>>> > > > >
>>> > > > > > Ok, see my first mail. When I put 'esci' as collection name I get a error.
>>> > > > > > When I put 'collection1' I get 'Connection working' and no errors in logs
>>> > > > > > but
>>> > > > > > still no docs in solr.
>>> > > > > >
>>> > > > > > K
>>> > > > > >
>>> > > > > > On Wed, Apr 01, 2015 at 10:27:50AM -0400, Karl Wright wrote:
>>> > > > > > > Hi Kamil,
>>> > > > > > >
>>> > > > > > > This is happening on the commit. It looks to me like it's because you are
>>> > > > > > > specifying a collection that doesn't actually exist:
>>> > > > > > >
>>> > > > > > > >>>>>>
>>> > > > > > > DocCollection col = getDocCollection(clusterState, collection);
>>> > > > > > >
>>> > > > > > > DocRouter router = col.getRouter();
>>> > > > > > > <<<<<<
>>> > > > > > >
>>> > > > > > > It's complaining because "col" is coming back null.
>>> > > > > > >
>>> > > > > > > Karl
>>> > > > > > >
>>> > > > > > >
>>> > > > > > > On Wed, Apr 1, 2015 at 10:19 AM, Kamil Żyta <[email protected]> wrote:
>>> > > > > > >
>>> > > > > > > > ERROR 2015-04-01 16:09:24,032 (Job notification thread) - Unhandled SolrServerException: java.lang.NullPointerException
>>> > > > > > > > org.apache.manifoldcf.core.interfaces.ManifoldCFException: Unhandled SolrServerException: java.lang.NullPointerException
>>> > > > > > > >     at org.apache.manifoldcf.agents.output.solr.HttpPoster.handleSolrServerException(HttpPoster.java:364)
>>> > > > > > > >     at org.apache.manifoldcf.agents.output.solr.HttpPoster.commitPost(HttpPoster.java:308)
>>> > > > > > > >     at org.apache.manifoldcf.agents.output.solr.SolrConnector.noteJobComplete(SolrConnector.java:610)
>>> > > > > > > >     at org.apache.manifoldcf.crawler.system.JobNotificationThread.run(JobNotificationThread.java:121)
>>> > > > > > > > Caused by: org.apache.solr.client.solrj.SolrServerException: java.lang.NullPointerException
>>> > > > > > > >     at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:873)
>>> > > > > > > >     at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:738)
>>> > > > > > > >     at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
>>> > > > > > > >     at org.apache.manifoldcf.agents.output.solr.HttpPoster$CommitThread.run(HttpPoster.java:1372)
>>> > > > > > > > Caused by: java.lang.NullPointerException
>>> > > > > > > >     at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:520)
>>> > > > > > > >     at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:892)
>>> > > > > > > >     at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:795)
>>> > > > > > > >     ... 3 more
>>> > > > > > > >
>>> > > > > > > > K
>>> > > > > > > >
>>> > > > > > > > On Wed, Apr 01, 2015 at 10:15:13AM -0400, Karl Wright wrote:
>>> > > > > > > > > Hi Kamil,
>>> > > > > > > > >
>>> > > > > > > > > So you are still seeing a NullPointerException from
>>> > > > > > > > > org.apache.solr.client.solrj.impl.CloudSolrClient? Can you provide the
>>> > > > > > > > > entire stack trace?
>>> > > > > > > > >
>>> > > > > > > > > Karl
>>> > > > > > > > >
>>> > > > > > > > >
>>> > > > > > > > > On Wed, Apr 1, 2015 at 10:10 AM, Kamil Żyta <[email protected]> wrote:
>>> > > > > > > > >
>>> > > > > > > > > > Hi Karl,
>>> > > > > > > > > > same thing with trunk. Any advice?
>>> > > > > > > > > >
>>> > > > > > > > > > K
>>> > > > > > > > > >
>>> > > > > > > > > > On Wed, Apr 01, 2015 at 09:37:47AM -0400, Karl Wright wrote:
>>> > > > > > > > > > > Hi Kamil,
>>> > > > > > > > > > >
>>> > > > > > > > > > > Solrj 5.0 changed massively from Solrj 4.x. The work to use Solrj 5.0 has
>>> > > > > > > > > > > been done on trunk. You will need to check out and build trunk in order to
>>> > > > > > > > > > > use Solr 5.
>>> > > > > > > > > > >
>>> > > > > > > > > > > Thanks,
>>> > > > > > > > > > > Karl
>>> > > > > > > > > > >
>>> > > > > > > > > > > On Wed, Apr 1, 2015 at 9:23 AM, Kamil Żyta <[email protected]> wrote:
>>> > > > > > > > > > >
>>> > > > > > > > > > > > Hi,
>>> > > > > > > > > > > > I set up solr 5 (Cloud) and mcf2, created core in solr with 2 shards and 2
>>> > > > > > > > > > > > replicas:
>>> > > > > > > > > > > > https://i.imgur.com/M05QTu7.png and created Output Connections in mcf.
>>> > > > > > > > > > > > When I put 'esci' in 'Collection name' I got error:
>>> > > > > > > > > > > > Threw exception: 'Unhandled SolrServerException: No live SolrServers
>>> > > > > > > > > > > > available to handle this request:[http://10.26.26.29:8983/solr/esci,
>>> > > > > > > > > > > > http://10.26.26.28:8983/solr/esci]'
>>> > > > > > > > > > > > When I leave 'Collection name' empty I have 'Connection working'.
>>> > > > > > > > > > > > Now when I start job, everything look good, worker fetch docs, etc
>>> > > > > > > > > > > > but I cannot see any docs in solr. Nothing in logs except one line in
>>> > > > > > > > > > > > worker
>>> > > > > > > > > > > > console:
>>> > > > > > > > > > > > [Thread-6476596] ERROR org.apache.solr.client.solrj.impl.CloudSolrClient -
>>> > > > > > > > > > > > Request to collection failed due to (0) java.lang.NullPointerException,
>>> > > > > > > > > > > > retry? 0
>>> > > > > > > > > > > > thanks for the advice.
>>> > > > > > > > > > > >
>>> > > > > > > > > > > > K
