[jira] [Created] (CONNECTORS-415) Minor errors in script engine documentation

2012-02-26 Thread Karl Wright (Created) (JIRA)
Minor errors in script engine documentation
---

 Key: CONNECTORS-415
 URL: https://issues.apache.org/jira/browse/CONNECTORS-415
 Project: ManifoldCF
  Issue Type: Bug
  Components: Documentation
Affects Versions: ManifoldCF 0.4, ManifoldCF 0.5
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Minor
 Fix For: ManifoldCF 0.5


There are a number of places where  and  are used instead of  and 
.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (CONNECTORS-415) Minor errors in script engine documentation

2012-02-26 Thread Karl Wright (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-415.


Resolution: Fixed

r1293918







[jira] [Created] (CONNECTORS-416) Can't click any of the buttons on the authentication tab for the web connector

2012-02-26 Thread Karl Wright (Created) (JIRA)
Can't click any of the buttons on the authentication tab for the web connector
--

 Key: CONNECTORS-416
 URL: https://issues.apache.org/jira/browse/CONNECTORS-416
 Project: ManifoldCF
  Issue Type: Bug
  Components: Web connector
Affects Versions: ManifoldCF 0.5
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 0.5


The session and page authentication buttons do not work.






[jira] [Resolved] (CONNECTORS-416) Can't click any of the buttons on the authentication tab for the web connector

2012-02-26 Thread Karl Wright (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-416.


Resolution: Fixed

r1293919







[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

2012-02-26 Thread Luca Stancapiano (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13216825#comment-13216825
 ] 

Luca Stancapiano commented on CONNECTORS-288:
-

Another thing I noticed is that 
org.apache.manifoldcf.crawler.system.WorkerThread and 
org.apache.manifoldcf.crawler.system.StartupThread are not active when the test 
starts. I suppose they support the jobs. When do they start?

 An ElasticSearch connector would be helpful
 ---

 Key: CONNECTORS-288
 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
 Project: ManifoldCF
  Issue Type: New Feature
Affects Versions: ManifoldCF 0.5
Reporter: Piergiorgio Lucidi
Assignee: Piergiorgio Lucidi
  Labels: elasticsearch
 Fix For: ManifoldCF next

 Attachments: manifold-elasticsearch-patch, 
 manifold-elasticsearch-patch, manifold-elasticsearch-patch, 
 manifold-elasticsearch-patch, manifold-elasticsearch-patch, 
 manifold-elasticsearch-patch, manifold-elasticsearch-patch, 
 manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct

   Original Estimate: 120h
  Remaining Estimate: 120h

 An ElasticSearch connector could be very useful to spread the use of 
 ManifoldCF





[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

2012-02-26 Thread Karl Wright (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13216827#comment-13216827
 ] 

Karl Wright commented on CONNECTORS-288:


From what I can see, the connector IS called, but it just throws an exception 
when it sets up its session.  I can instrument the connector if you like in 
order to prove this to you.

If you want to see this, just browse to localhost:8346/mcf-crawler-ui while the 
test is running.  View the output connection.  You will see the exception I've 
already reported.

WorkerThread and StartupThread will not become active until the agents process 
starts.  In a test, this happens during a @Before method.







[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

2012-02-26 Thread Karl Wright (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13216830#comment-13216830
 ] 

Karl Wright commented on CONNECTORS-288:


Instrumentation yields the following:

[junit] org.apache.manifoldcf.core.interfaces.ManifoldCFException: Server/page not found
[junit] at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnection.call(ElasticSearchConnection.java:111)
[junit] at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchAction.init(ElasticSearchAction.java:37)
[junit] at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnector.check(ElasticSearchConnector.java:389)

I'm instrumenting the ElasticSearchAction constructor now to see what URL it 
thinks it is using.







[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

2012-02-26 Thread Karl Wright (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13216840#comment-13216840
 ] 

Karl Wright commented on CONNECTORS-288:


So when I added a System.out.println of the URL in ElasticSearchAction, I no 
longer get any errors; the check() response is OK.  (That is wrong, by the 
way; it should return super.check() instead, which yields "Connection working".)

The instrumented URL output looks like this:

[junit] URL is 'http://localhost:9200/index/_optimize'
[junit] URL is 'http://localhost:9200/index/_status'
[junit] URL is 'http://localhost:9200/index/_optimize'
[junit] URL is 'http://localhost:9200/index/_optimize'
[junit] URL is 'http://localhost:9200/index/_optimize'

... followed by the 12 ms timeout.

Some conclusions: (1) We should fix the check() method; (2) The fact that 
check() succeeds sometimes and fails others is quite disconcerting; clearly the 
connector is doing something pretty wrong.
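
For conclusion (1), here is a minimal sketch of what a corrected check() might look like. It assumes hypothetical stand-in classes: SketchBaseConnector, SketchElasticSearchConnector, and the Probe interface are illustrative names only, not the real ManifoldCF or connector classes, and the "Connection failed" wording is invented for the example. The point is only the shape: report the failure if the probe fails, otherwise defer to super.check().

```java
// Hypothetical stand-ins; these are NOT the real ManifoldCF classes.
class SketchBaseConnector {
    /** Default health check: reports that the connection is usable. */
    public String check() {
        return "Connection working";
    }
}

class SketchElasticSearchConnector extends SketchBaseConnector {
    /** Hypothetical probe: returns null on success, or an error message. */
    interface Probe {
        String run();
    }

    private final Probe probe;

    SketchElasticSearchConnector(Probe probe) {
        this.probe = probe;
    }

    @Override
    public String check() {
        String error = probe.run();
        if (error != null) {
            // Surface the failure instead of hiding it behind a fixed "OK".
            return "Connection failed: " + error;
        }
        // On success, defer to the base class for the standard status text.
        return super.check();
    }
}

class CheckSketchDemo {
    public static void main(String[] args) {
        System.out.println(new SketchElasticSearchConnector(() -> null).check());
        System.out.println(
            new SketchElasticSearchConnector(() -> "Server/page not found").check());
    }
}
```

With this shape, the text shown in the UI for a healthy connection comes from one place (the base class), so the connector cannot drift from it.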

I also looked more deeply at the code itself.  The addOrReplaceDocument() 
method uses a synchronizer to permit only one thread to index at a time.  This 
does not seem correct to me, and it is thus probable that the problem stems 
from improper understanding of the ManifoldCF threading model.  Each connector 
instance should be working with its own ElasticSearchIndex object and its own 
HttpClient method so that all of the threads can operate independently without 
collision.
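
To illustrate the threading point, here is a rough sketch under stated assumptions. SketchIndexer and SketchThreadConnector are hypothetical stand-ins (not the real ElasticSearchIndex or connector classes), and the thread pool merely imitates worker threads that each hold their own connector instance. Because no mutable state is shared between instances, no synchronizer is needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a per-instance helper such as an index client.
class SketchIndexer {
    private int indexed = 0; // touched by exactly one thread, so no lock is needed

    void index(String documentUri) {
        indexed++;
    }

    int count() {
        return indexed;
    }
}

// Hypothetical connector: each instance owns its helper and shares nothing mutable.
class SketchThreadConnector {
    private final SketchIndexer indexer = new SketchIndexer();

    int addOrReplaceDocuments(List<String> uris) {
        for (String uri : uris) {
            indexer.index(uri);
        }
        return indexer.count();
    }
}

class ThreadingSketchDemo {
    /** Runs one connector instance per worker and returns the total number indexed. */
    static int run(int workers, int docsPerWorker) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                final int worker = w;
                results.add(pool.submit(() -> {
                    // Each worker builds and uses its own connector instance.
                    SketchThreadConnector connector = new SketchThreadConnector();
                    List<String> uris = new ArrayList<>();
                    for (int d = 0; d < docsPerWorker; d++) {
                        uris.add("doc-" + worker + "-" + d);
                    }
                    return connector.addOrReplaceDocuments(uris);
                }));
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(4, 100));
    }
}
```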







[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

2012-02-26 Thread Karl Wright (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13216851#comment-13216851
 ] 

Karl Wright commented on CONNECTORS-288:


Looking at the actual test run, the history reports the following at the end:

{code}
02-26-2012 16:09:25.129  Indexation (ElasticSearch)  http://localhost:9200/index/_optimize  OK  0  69
02-26-2012 16:09:24.939  job end  1330290457146(Test Job)  0  1
02-26-2012 16:09:14.909  Deletion (ElasticSearch)  http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../138/null  OK  0  7
02-26-2012 16:09:07.787  Optimize (ElasticSearch)  http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../140/null  OK  27  6
02-26-2012 16:09:07.778  Optimize (ElasticSearch)  http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../139/null  OK  27  7
02-26-2012 16:09:07.769  Optimize (ElasticSearch)  http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../137/null  OK  27  15
02-26-2012 16:09:05.278  job start  1330290457146(Test Job)  0  1
02-26-2012 16:08:55.020  Indexation (ElasticSearch)  http://localhost:9200/index/_optimize  OK  0  93
02-26-2012 16:08:54.926  job end  1330290457146(Test Job)  0  1
02-26-2012 16:08:47.678  Optimize (ElasticSearch)  http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../138/null  OK  27  10
02-26-2012 16:08:47.666  Optimize (ElasticSearch)  http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../140/null  OK  27  6
02-26-2012 16:08:47.652  Optimize (ElasticSearch)  http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../139/null  OK  27  11
02-26-2012 16:08:47.646  Optimize (ElasticSearch)  http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../137/null  OK  27  13
02-26-2012 16:08:45.192  job start  1330290457146(Test Job)  0  1
02-26-2012 16:08:34.940  Indexation (ElasticSearch)  http://localhost:9200/index/_optimize  OK  0  75
02-26-2012 16:08:34.917  job end  1330290457146(Test Job)  0  1
02-26-2012 16:08:29.502  Optimize (ElasticSearch)  http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../139/null  OK  27  10
02-26-2012 16:08:29.491  Optimize (ElasticSearch)  http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../137/null  OK  27  8
02-26-2012 16:08:29.412  Optimize (ElasticSearch)  http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../140/null  OK  27  66
02-26-2012 16:08:29.404  Optimize (ElasticSearch)  http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../138/null  OK  27  68
02-26-2012 16:08:25.097  job start  1330290457146(Test Job)  0  1
02-26-2012 16:08:24.846  Indexation (ElasticSearch)  http://localhost:9200/index/_optimize  OK  0  88
02-26-2012 16:08:14.890  job stop  1330290457146(Test Job)  0  1
02-26-2012 16:08:09.041  Optimize (ElasticSearch)  http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../138/null  OK  27  868
02-26-2012 16:08:04.900  job start  1330290457146(Test Job)  0  1
{code}

The job at the end is stuck in the Cleaning up state, which indicates that it 
is trying to delete the documents from the index, but is not succeeding for 
some reason.



[jira] [Issue Comment Edited] (CONNECTORS-288) An ElasticSearch connector would be helpful

2012-02-26 Thread Karl Wright (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13216851#comment-13216851
 ] 

Karl Wright edited comment on CONNECTORS-288 at 2/26/12 9:16 PM:
-

Looking at the actual test run, the history reports the following at the end:

{code}
02-26-2012 16:09:25.129  Indexation (ElasticSearch)  http://localhost:9200/index/_optimize  OK  0  69
02-26-2012 16:09:24.939  job end  1330290457146(Test Job)  0  1
02-26-2012 16:09:14.909  Deletion (ElasticSearch)  http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../138/null  OK  0  7
02-26-2012 16:09:07.787  Optimize (ElasticSearch)  http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../140/null  OK  27  6
02-26-2012 16:09:07.778  Optimize (ElasticSearch)  http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../139/null  OK  27  7
02-26-2012 16:09:07.769  Optimize (ElasticSearch)  http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../137/null  OK  27  15
02-26-2012 16:09:05.278  job start  1330290457146(Test Job)  0  1
02-26-2012 16:08:55.020  Indexation (ElasticSearch)  http://localhost:9200/index/_optimize  OK  0  93
02-26-2012 16:08:54.926  job end  1330290457146(Test Job)  0  1
02-26-2012 16:08:47.678  Optimize (ElasticSearch)  http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../138/null  OK  27  10
02-26-2012 16:08:47.666  Optimize (ElasticSearch)  http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../140/null  OK  27  6
02-26-2012 16:08:47.652  Optimize (ElasticSearch)  http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../139/null  OK  27  11
02-26-2012 16:08:47.646  Optimize (ElasticSearch)  http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../137/null  OK  27  13
02-26-2012 16:08:45.192  job start  1330290457146(Test Job)  0  1
02-26-2012 16:08:34.940  Indexation (ElasticSearch)  http://localhost:9200/index/_optimize  OK  0  75
02-26-2012 16:08:34.917  job end  1330290457146(Test Job)  0  1
02-26-2012 16:08:29.502  Optimize (ElasticSearch)  http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../139/null  OK  27  10
02-26-2012 16:08:29.491  Optimize (ElasticSearch)  http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../137/null  OK  27  8
02-26-2012 16:08:29.412  Optimize (ElasticSearch)  http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../140/null  OK  27  66
02-26-2012 16:08:29.404  Optimize (ElasticSearch)  http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../138/null  OK  27  68
02-26-2012 16:08:25.097  job start  1330290457146(Test Job)  0  1
02-26-2012 16:08:24.846  Indexation (ElasticSearch)  http://localhost:9200/index/_optimize  OK  0  88
02-26-2012 16:08:14.890  job stop  1330290457146(Test Job)  0  1
02-26-2012 16:08:09.041  Optimize (ElasticSearch)  http://localhost:9090/chemistry-opencmis-server-inmemory/atom.../138/null  OK  27  868
02-26-2012 16:08:04.900  job start  1330290457146(Test Job)  0  1
{code}

The job at the end is stuck in the Cleaning up state, which indicates that it 
is trying to delete the documents from the index, but is not succeeding for 
some reason.  The jobstatus reports 4 documents at that time.

The CMIS connector is not helping here because it does not seem to record ANY 
activities.  It also looks like the activities being recorded for the 
ElasticSearch connector are backwards; it records Optimize when it should 
record Indexation, and vice versa.




[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

2012-02-26 Thread Karl Wright (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13216966#comment-13216966
 ] 

Karl Wright commented on CONNECTORS-288:


Just checked the manifoldcf.log file from the test crawl.  Here's a snippet:

{code}
ERROR 2012-02-26 16:08:09,921 (Worker thread '4') - Exception tossed:
org.apache.manifoldcf.core.interfaces.ManifoldCFException:
    at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnection.call(ElasticSearchConnection.java:111)
    at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchIndex.init(ElasticSearchIndex.java:100)
    at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnector.addOrReplaceDocument(ElasticSearchConnector.java:357)
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1579)
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:504)
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:370)
    at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1577)
    at org.apache.manifoldcf.crawler.connectors.cmis.CmisRepositoryConnector.processDocuments(CmisRepositoryConnector.java:1162)
    at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:561)
ERROR 2012-02-26 16:09:35,903 (Document delete thread '7') - Exception tossed: Server/page not found
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Server/page not found
    at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnection.call(ElasticSearchConnection.java:111)
    at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchDelete.init(ElasticSearchDelete.java:35)
    at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnector.removeDocument(ElasticSearchConnector.java:378)
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.removeDocument(IncrementalIngester.java:1598)
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentDeleteMultiple(IncrementalIngester.java:748)
    at org.apache.manifoldcf.crawler.system.DocumentDeleteThread.run(DocumentDeleteThread.java:130)
ERROR 2012-02-26 16:09:36,908 (Document delete thread '9') - Exception tossed: Server/page not found
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Server/page not found
    at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnection.call(ElasticSearchConnection.java:111)
    at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchDelete.init(ElasticSearchDelete.java:35)
    at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnector.removeDocument(ElasticSearchConnector.java:378)
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.removeDocument(IncrementalIngester.java:1598)
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentDeleteMultiple(IncrementalIngester.java:748)
    at org.apache.manifoldcf.crawler.system.DocumentDeleteThread.run(DocumentDeleteThread.java:130)
ERROR 2012-02-26 16:09:37,907 (Document delete thread '8') - Exception tossed: Server/page not found
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Server/page not found
    at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnection.call(ElasticSearchConnection.java:111)
    at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchDelete.init(ElasticSearchDelete.java:35)
    at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnector.removeDocument(ElasticSearchConnector.java:378)
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.removeDocument(IncrementalIngester.java:1598)
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentDeleteMultiple(IncrementalIngester.java:748)
    at org.apache.manifoldcf.crawler.system.DocumentDeleteThread.run(DocumentDeleteThread.java:130)
{code}



[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

2012-02-26 Thread Karl Wright (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13216971#comment-13216971
 ] 

Karl Wright commented on CONNECTORS-288:


More code review:

The architecture you are using to cache specifications could use some 
improvement, I think.  The method getOutputDescription() is not meant to 
perform a blind conversion of the output specification to a string, but to 
include only those parameters that, if changed, would change what was indexed.  
Furthermore, the format of the string is expected to be quickly unpackable, 
so that no caching should be necessary even if parameters need to be parsed 
from the string.  To help, there is a set of pack/unpack methods available 
for your use from the base class that are reasonably performant and meant for 
this purpose.  See the Solr connector for an idea of how these are used.  Or, 
you can continue to use JSON, but when you go back and forth to JSON I suspect 
you're doing more work than the pack/unpack methods would do.
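
As a rough illustration of the idea (not the actual base-class methods, whose signatures are not shown in this thread), here is a self-contained sketch of a delimiter-and-escape packing scheme. SketchPacker and the three parameters (maxFileSize, mimeTypes, extensions) are hypothetical names chosen for the example; the real connector would pack whichever settings actually affect what gets indexed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; a stand-in for, not a copy of, the base-class pack/unpack methods.
class SketchPacker {
    private static final char DELIM = '+';
    private static final char ESCAPE = '\\';

    /** Appends value to out, escaping DELIM and ESCAPE, then terminates it with DELIM. */
    static void pack(StringBuilder out, String value) {
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == DELIM || c == ESCAPE) {
                out.append(ESCAPE);
            }
            out.append(c);
        }
        out.append(DELIM);
    }

    /** Splits a packed string back into its fields in a single pass. */
    static List<String> unpack(String packed) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < packed.length(); i++) {
            char c = packed.charAt(i);
            if (c == ESCAPE && i + 1 < packed.length()) {
                current.append(packed.charAt(++i)); // take the escaped character literally
            } else if (c == DELIM) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        return fields;
    }

    /** Builds a description from only the (hypothetical) settings that affect indexing. */
    static String describe(String maxFileSize, String mimeTypes, String extensions) {
        StringBuilder sb = new StringBuilder();
        pack(sb, maxFileSize);
        pack(sb, mimeTypes);
        pack(sb, extensions);
        return sb.toString();
    }
}

class PackSketchDemo {
    public static void main(String[] args) {
        String description = SketchPacker.describe("16777216", "text/plain+text/html", "txt");
        System.out.println(description);
        System.out.println(SketchPacker.unpack(description));
    }
}
```

Unpacking is a single linear scan with no parser state beyond one buffer, which is the property that makes a separate cache of parsed specifications unnecessary.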

If you do decide to cache things for whatever reason, I would urge you to use 
the ICacheManager construct, since that will be guaranteed to be maintained 
over the long run.  Ideally, your finished code should not have any 
synchronized blocks in it at all, since synchronization is managed largely by 
the framework.

Another subject we should talk about is managing the HTTP connection pool.  I 
noted that you put pool management into one of the subclasses 
(ElasticSearchConnection).  The problem with that is that you want the lifetime 
of the pool to be the lifetime of the ElasticSearchConnector class instance; 
otherwise the pool is not going to do you much good.  So I would move the 
MultiThreadedHttpConnectionManager instance to the main ElasticSearchConnector 
class, and provide an ElasticSearchConnector method that fetches an HttpClient 
object from that instance - or just pass it in when you construct 
ElasticSearchConnection.  Also, don't forget to hook up the poll() method to 
the MultiThreadedHttpConnectionManager instance so that connections will be 
closed when idle.  See the SharePoint connector for an idea of how this is done.
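
The ownership arrangement described above might be sketched as follows. Everything here is a hypothetical stand-in so the example stays self-contained: SketchConnectionPool imitates a connection manager, SketchConnection imitates the per-request helper, and the 60-second idle limit is an assumed value. The real code would use the HttpClient connection manager instead.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical pool; a stand-in for a real HTTP connection manager.
class SketchConnectionPool {
    // One timestamp per idle connection, recording when it was released.
    private final List<Long> idleSince = new ArrayList<>();

    void release(long nowMillis) {
        idleSince.add(nowMillis);
    }

    int idleCount() {
        return idleSince.size();
    }

    /** Closes connections that have been idle for at least maxIdleMillis. */
    void closeIdleConnections(long nowMillis, long maxIdleMillis) {
        for (Iterator<Long> it = idleSince.iterator(); it.hasNext();) {
            if (nowMillis - it.next() >= maxIdleMillis) {
                it.remove();
            }
        }
    }
}

// Hypothetical per-request helper: it borrows the pool and never creates one.
class SketchConnection {
    private final SketchConnectionPool pool;

    SketchConnection(SketchConnectionPool pool) {
        this.pool = pool;
    }

    void call(long nowMillis) {
        // ... perform the request, then hand the connection back to the shared pool.
        pool.release(nowMillis);
    }
}

// Hypothetical connector: the pool lives exactly as long as this instance.
class SketchPooledConnector {
    private static final long MAX_IDLE_MILLIS = 60_000L; // assumed idle limit
    private final SketchConnectionPool pool = new SketchConnectionPool();

    void removeDocument(long nowMillis) {
        // The pool is passed in at construction, as suggested above.
        new SketchConnection(pool).call(nowMillis);
    }

    /** Invoked periodically; a natural place to drop idle connections. */
    void poll(long nowMillis) {
        pool.closeIdleConnections(nowMillis, MAX_IDLE_MILLIS);
    }

    int idleConnections() {
        return pool.idleCount();
    }
}
```

Timestamps are passed in explicitly only to keep the sketch deterministic; a real implementation would let the connection manager track idle time itself.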


