[jira] [Commented] (CONNECTORS-288) An ElasticSearch connector would be helpful

Karl Wright (Commented) (JIRA) Sun, 26 Feb 2012 16:59:15 -0800

    [ 
https://issues.apache.org/jira/browse/CONNECTORS-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216971#comment-13216971
 ]


Karl Wright commented on CONNECTORS-288:
----------------------------------------

More code review:

The architecture you are using to cache specifications could use some 
improvement, I think.  The method getOutputDescription() is not meant to 
perform a blind conversion of the output specification to a string, but to 
include only those parameters that, if changed, would change what was indexed.  
Furthermore, the format of the string is expected to be quick to unpack, so 
that no caching is necessary even if parameters need to be parsed from the 
string.  To help, the base class provides a set of pack/unpack methods that 
are reasonably performant and meant for this purpose.  See the Solr connector 
for an idea of how these are used.  Or, you can continue to use JSON, but I 
suspect that converting to and from JSON does more work than the pack/unpack 
methods would.
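
To illustrate the kind of description string meant above, here is a minimal 
sketch of delimiter-based packing.  The pack() and unpack() helpers below are 
hypothetical stand-ins written for this example, not the base class methods 
themselves (whose names and signatures may differ), and the indexName and 
indexType parameters are assumed purely for illustration.

```java
// Hypothetical stand-ins for the base-class pack/unpack helpers; the real
// ManifoldCF methods may differ in name and signature.
final class PackSketch {
    private PackSketch() {}

    /** Appends value to out, escaping backslash and the delimiter, then appends the delimiter. */
    static void pack(StringBuilder out, String value, char delimiter) {
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '\\' || c == delimiter) {
                out.append('\\');
            }
            out.append(c);
        }
        out.append(delimiter);
    }

    /** Reads one field starting at start into out; returns the index just past its delimiter. */
    static int unpack(StringBuilder out, String packed, int start, char delimiter) {
        int i = start;
        while (i < packed.length()) {
            char c = packed.charAt(i++);
            if (c == '\\' && i < packed.length()) {
                out.append(packed.charAt(i++)); // escaped character, copy literally
            } else if (c == delimiter) {
                break;                          // end of this field
            } else {
                out.append(c);
            }
        }
        return i;
    }

    /** Builds a description from only the parameters that affect what gets indexed (assumed here). */
    static String describe(String indexName, String indexType) {
        StringBuilder sb = new StringBuilder();
        pack(sb, indexName, '+');
        pack(sb, indexType, '+');
        return sb.toString();
    }

    /** Recovers the two fields in a single pass over the string. */
    static String[] parse(String description) {
        String[] fields = new String[2];
        int pos = 0;
        for (int k = 0; k < fields.length; k++) {
            StringBuilder sb = new StringBuilder();
            pos = unpack(sb, description, pos, '+');
            fields[k] = sb.toString();
        }
        return fields;
    }
}
```

Because parsing is one linear scan with no intermediate object tree, there is 
little to gain from caching the parsed result.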

If you do decide to cache things for whatever reason, I would urge you to use 
the ICacheManager construct, since that will be guaranteed to be maintained 
over the long run.  Ideally, your code when done should not have any 
synchronized blocks in it at all, since synchronization is managed largely by 
the framework.

Another subject we should talk about is managing the HTTP connection pool.  I 
noted that you put pool management into one of the subclasses 
(ElasticSearchConnection).  The problem with that is that you want the lifetime 
of the pool to be the lifetime of the ElasticSearchConnector class instance, 
otherwise the pool is not going to do you much good.  So I would move the 
MultiThreadedHttpConnectionManager instance to the main ElasticSearchConnector 
class, and provide an ElasticSearchConnector method that fetches an HttpClient 
object from that instance - or just pass it in when you construct 
ElasticSearchConnection.  Also, don't forget to hook up the poll() method to 
the MultiThreadedHttpConnectionManager instance so that connections will be 
closed when idle.  See the SharePoint connector for an idea of how this is done.
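
A rough sketch of the ownership arrangement described above, kept free of 
external dependencies so it can stand alone.  PoolSketch is a hypothetical 
stand-in for the pooled connection manager (in commons-httpclient 3.x that role 
would be played by MultiThreadedHttpConnectionManager, which has 
closeIdleConnections(long) and shutdown() methods).  ConnectorSketch and 
ConnectionSketch stand in for ElasticSearchConnector and 
ElasticSearchConnection, and the 60 second idle timeout and the disconnect() 
hook are assumptions, not taken from the patch.

```java
// Hypothetical stand-in for the pooled HTTP connection manager.
final class PoolSketch {
    int idleCloses = 0;       // counts idle sweeps, for illustration only
    boolean shutDown = false;

    void closeIdleConnections(long idleTimeoutMillis) {
        idleCloses++;         // a real pool would close connections idle longer than the timeout
    }

    void shutdown() {
        shutDown = true;
    }
}

// Plays the role of ElasticSearchConnection: it borrows the pool, it does not own it.
final class ConnectionSketch {
    private final PoolSketch pool;

    ConnectionSketch(PoolSketch pool) {
        this.pool = pool;
    }

    boolean usesPool(PoolSketch candidate) {
        return pool == candidate;
    }
}

// Plays the role of ElasticSearchConnector: it owns the pool for its whole lifetime.
final class ConnectorSketch {
    private static final long IDLE_TIMEOUT_MILLIS = 60000L; // assumed value

    private PoolSketch pool;

    /** Lazily creates the single pool shared by every connection this connector hands out. */
    PoolSketch getPool() {
        if (pool == null) {
            pool = new PoolSketch();
        }
        return pool;
    }

    /** Passes the shared pool in at construction time. */
    ConnectionSketch newConnection() {
        return new ConnectionSketch(getPool());
    }

    /** Periodic framework callback: forward it to the pool so idle connections get closed. */
    void poll() {
        if (pool != null) {
            pool.closeIdleConnections(IDLE_TIMEOUT_MILLIS);
        }
    }

    /** Releases the pool when the connector itself is released. */
    void disconnect() {
        if (pool != null) {
            pool.shutdown();
            pool = null;
        }
    }
}
```

With this shape, every connection created by the same connector instance 
reuses one pool, and the pool goes away only when the connector does.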

                
> An ElasticSearch connector would be helpful
> -------------------------------------------
>
>                 Key: CONNECTORS-288
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-288
>             Project: ManifoldCF
>          Issue Type: New Feature
>    Affects Versions: ManifoldCF 0.5
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>              Labels: elasticsearch
>             Fix For: ManifoldCF next
>
>         Attachments: manifold-elasticsearch-patch, 
> manifold-elasticsearch-patch, manifold-elasticsearch-patch, 
> manifold-elasticsearch-patch, manifold-elasticsearch-patch, 
> manifold-elasticsearch-patch, manifold-elasticsearch-patch, 
> manifold-elasticsearch-velocity-patch, manifoldcf-elasticsearch-project-patct
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> An ElasticSearch connector could be very useful to spread the use of 
> ManifoldCF

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
