[
https://issues.apache.org/jira/browse/MAILBOX-155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tellier Benoit updated MAILBOX-155:
-----------------------------------
Attachment: MAILBOX-155.patch
This patch contains our implementation of an ElasticSearch index.
# **Features**
It is an implementation for a MessageSearch index. It will behave well in a
distributed context.
Mail is indexed so that other applications can get great search results.
For example:
- Mailbox-related events are also indexed (this lets you write queries that,
for instance, search for an e-mail in all mailboxes of a user).
- Attachments are indexed if they are text (I will submit support for
arbitrary attachments tomorrow).
*Difficulties we encountered*
As we have conflicting dependencies with the Apache Lucene module, we chose to
"do things in a different way":
- We used Jest as a client.
- We developed our own query builder.
# *Note before running tests*
Note that because of this dependency conflict, running the tests requires a
specific test environment, with ElasticSearch configured. That's why I
commented out some tests.
*Sub project structure*
In the ElasticSearch module you will find:
- Indexes (we have 3 different implementations):
- One that indexes directly in ElasticSearch
- One that indexes through a Kafka queue (decouples mail processing from
indexing)
- One that uses an embedded Kafka (fewer infrastructure requirements)
The Kafka modules require the river to be configured:
https://ci.open-paas.org/stash/projects/JWC/repos/kafka-river/browse
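To illustrate the difference between indexing directly and indexing through a queue, here is a minimal sketch. It is not the code from the patch: an in-memory `BlockingQueue` stands in for Kafka, a list stands in for ElasticSearch, and every name in it (`IndexSketch`, `DirectIndexSketch`, `QueuedIndexSketch`, `drainOne`) is hypothetical.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical minimal index contract; not the James search index interface. */
interface IndexSketch {
    void add(String jsonDocument) throws InterruptedException;
}

/** Writes straight to the backend (a list stands in for ElasticSearch). */
final class DirectIndexSketch implements IndexSketch {
    final List<String> backend = new CopyOnWriteArrayList<>();

    @Override
    public void add(String jsonDocument) {
        backend.add(jsonDocument);
    }
}

/** Only enqueues documents; a separate consumer moves them to the target index. */
final class QueuedIndexSketch implements IndexSketch {
    // Assumption: an in-memory queue stands in for the Kafka topic.
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final IndexSketch target;

    QueuedIndexSketch(IndexSketch target) {
        this.target = target;
    }

    @Override
    public void add(String jsonDocument) throws InterruptedException {
        // Returns as soon as the document is queued, so mail processing
        // does not wait for the search backend.
        queue.put(jsonDocument);
    }

    /** Moves at most one document; returns false if none arrived in time. */
    boolean drainOne(long timeoutMillis) throws InterruptedException {
        String document = queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        if (document == null) {
            return false;
        }
        target.add(document);
        return true;
    }
}
```

In this sketch the consumer role (played by the river in the real setup) is reduced to calling `drainOne` in a loop on another thread.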
You have a bulk generation unit and a query unit (which converts James
search requests into ElasticSearch requests).
You will find our query builder in the dsl folder. It implements only the
operations we need. We were forced to write it because Jest requires either a
String or the ElasticSearch query builder (which relies on Lucene).
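As a rough idea of what a Lucene-free query builder can look like, here is a minimal sketch that assembles ElasticSearch query JSON as plain Strings. It is not the builder from the dsl folder: the class `QuerySketch`, its methods, and the field names used with it are hypothetical. Only the `match`, `term` and `bool`/`must` query shapes come from the ElasticSearch query DSL.

```java
/** Hypothetical sketch of a String-based query builder with no Lucene dependency. */
final class QuerySketch {
    private QuerySketch() {
    }

    /** Returns the value as a quoted JSON string, escaping special characters. */
    static String quote(String value) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : value.toCharArray()) {
            switch (c) {
                case '"': sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the 4-digit hex form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    /** Full-text clause: {"match":{field:text}}. */
    static String match(String field, String text) {
        return "{\"match\":{" + quote(field) + ":" + quote(text) + "}}";
    }

    /** Exact-value clause: {"term":{field:value}}. */
    static String term(String field, String value) {
        return "{\"term\":{" + quote(field) + ":" + quote(value) + "}}";
    }

    /** Conjunction: {"bool":{"must":[clause, ...]}}. */
    static String must(String... clauses) {
        return "{\"bool\":{\"must\":[" + String.join(",", clauses) + "]}}";
    }

    /** Wraps a query clause into a search request body: {"query":clause}. */
    static String searchBody(String query) {
        return "{\"query\":" + query + "}";
    }
}
```

For instance, `QuerySketch.searchBody(QuerySketch.must(QuerySketch.term("mailboxId", "42"), QuerySketch.match("subject", "hello")))` yields a String that can be handed to a String-based client; the field names `mailboxId` and `subject` are made up for the example.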
We need to generate JSON from messages. We did this in the store module, as
we thought other parts of James might need it some day.
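The JSON generation step can be pictured with the following minimal sketch. It is not the code from the patch: the class `MessageJsonSketch` and the document fields (`uid`, `mailboxId`, `subject`, `textBody`) are assumptions chosen for illustration, and a real implementation would more likely rely on a JSON library.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical sketch: turns a few message properties into a JSON document. */
final class MessageJsonSketch {
    private MessageJsonSketch() {
    }

    /** Escapes quotes, backslashes and common control characters for JSON. */
    static String escape(String s) {
        return s.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n")
                .replace("\r", "\\r")
                .replace("\t", "\\t");
    }

    /** Renders a flat map as a JSON object, keeping insertion order. */
    static String toJson(Map<String, Object> fields) {
        StringJoiner joiner = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, Object> entry : fields.entrySet()) {
            Object value = entry.getValue();
            String rendered;
            if (value == null) {
                rendered = "null";
            } else if (value instanceof Number || value instanceof Boolean) {
                rendered = value.toString();
            } else {
                rendered = "\"" + escape(value.toString()) + "\"";
            }
            joiner.add("\"" + escape(entry.getKey()) + "\":" + rendered);
        }
        return joiner.toString();
    }

    /** Assumed document layout; the real index mapping may differ. */
    static String messageToJson(long uid, String mailboxId, String subject, String textBody) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("uid", uid);
        fields.put("mailboxId", mailboxId);
        fields.put("subject", subject);
        fields.put("textBody", textBody);
        return toJson(fields);
    }
}
```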
We also created an exception dedicated to the case where ElasticSearch is
offline.
*What should I do before running tests?*
Tests are provided for each added component.
To test ElasticSearch integration without Kafka:
requirement: standalone ElasticSearch
uncomment:
elasticsearch/src/test/java/org/apache/james/mailbox/elasticsearch/search/index/DirectMessageSearchIndexTest.java
To test ElasticSearch integration with Kafka:
requirement: standalone ElasticSearch with a configured river, standalone Kafka
uncomment:
elasticsearch/src/test/java/org/apache/james/mailbox/elasticsearch/search/index/KafkaMessageSearchIndexTest.java
To test ElasticSearch integration through EmbeddedKafka:
requirement: ElasticSearch with a configured river
uncomment:
elasticsearch/src/test/java/org/apache/james/mailbox/elasticsearch/search/index/EmbeddedKafkaSearchIndexTest.java
> Add elasticsearch based search index
> ------------------------------------
>
> Key: MAILBOX-155
> URL: https://issues.apache.org/jira/browse/MAILBOX-155
> Project: James Mailbox
> Issue Type: New Feature
> Reporter: Norman Maurer
> Assignee: Norman Maurer
> Attachments: MAILBOX-155.patch
>
>