[xwiki-devs] Review of the GSOC SOLR work for integration in XWiki Platform

Vincent Massol Tue, 28 Aug 2012 05:05:05 -0700

Hi Savitha,

I've started quickly reviewing the SOLR code in preparation for an integration 
in the platform and I have some questions, which I jotted down below as I 
was reviewing the code. Sorry for the terse format; I actually wrote the 
questions to myself and then decided to send them as is :)


General:

* Need an architecture diagram showing the main modules and threads and how 
they interact with the platform

Search-api:

* Is the Search API supposed to be independent of SOLR?
* The Search interface is strange: it exposes implementation details such as 
getImplementation() and initialize()
* It also has other concerns such as getStatus(), getStatusAsJson(), 
getVelocityUtils(), getSearchRequest()
* Why do we need a Search interface? Why not instead use the Query module and 
introduce a new query type? (Note: the List returned by Query.execute() 
probably needs to be clarified.) Replace SearchRequest with a Query impl
* The naming of the interfaces is a bit strange. For example: BuildIndex; 
should it be IndexBuilder instead? What about DeleteIndex, should it be 
IndexDeleter?
* I don't think we need deleteDocumentIndex(), deleteWikiIndex(), 
deleteSpaceIndex(), etc. We need a single deleteEntity(EntityReference 
reference, EntityType type). Same for IndexBuilder.
* Why is there a DocumentIndexer interface? Why is a Document different from 
other entities? For example, I can see DocumentIndexer.deleteIndex(); why not 
IndexDeleter.deleteEntity(documentRef)?
* Why is there a need for RebuildIndex (which I assume is IndexRebuilder) and 
why can't we use the IndexBuilder?
* Why the need for SearchIndex?
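
To make the single deleteEntity() suggestion above more concrete, here is a rough sketch. Everything in it is hypothetical: EntityRef and EntityType are simplified stand-ins for the platform's EntityReference and EntityType (the type is carried inside the reference here rather than passed as a second argument), and the in-memory index only exists to make the example runnable.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch, not the actual XWiki or GSOC search API.
final class EntityIndexSketch {
    /** Simplified stand-in for the platform's entity type. */
    enum EntityType { WIKI, SPACE, DOCUMENT }

    /** Simplified stand-in for an entity reference: a type plus a path such as "xwiki:Main.WebHome". */
    static final class EntityRef {
        final EntityType type;
        final String path;

        EntityRef(EntityType type, String path) {
            this.type = type;
            this.path = path;
        }
    }

    /** One entry point for deletion, whatever the entity type. */
    interface IndexDeleter {
        /** Removes the entity and everything below it; returns the number of removed documents. */
        int deleteEntity(EntityRef reference);
    }

    /** Toy in-memory index of document paths (assumption: documents are the only indexed unit). */
    static final class InMemoryIndex implements IndexDeleter {
        final List<String> indexedDocuments = new ArrayList<>();

        @Override
        public int deleteEntity(EntityRef reference) {
            int removed = 0;
            for (Iterator<String> it = indexedDocuments.iterator(); it.hasNext();) {
                if (covers(reference, it.next())) {
                    it.remove();
                    removed++;
                }
            }
            return removed;
        }

        /** A wiki or space reference covers the documents below it; a document reference covers itself. */
        private static boolean covers(EntityRef reference, String documentPath) {
            switch (reference.type) {
                case WIKI:
                    return documentPath.startsWith(reference.path + ":");
                case SPACE:
                    return documentPath.startsWith(reference.path + ".");
                default:
                    return documentPath.equals(reference.path);
            }
        }
    }

    public static void main(String[] args) {
        InMemoryIndex index = new InMemoryIndex();
        index.indexedDocuments.add("xwiki:Main.WebHome");
        index.indexedDocuments.add("xwiki:Sandbox.WebHome");
        // The same call handles a space, a wiki or a single document.
        System.out.println(index.deleteEntity(new EntityRef(EntityType.SPACE, "xwiki:Main")));
    }
}
```

The point of the sketch is only that callers no longer choose between deleteDocumentIndex(), deleteSpaceIndex() and so on; the same shape would apply to IndexBuilder.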

Search-solrj:

* The solrj server is used in embedded mode.
* Shouldn't use a system property for the solrj home but the XWiki 
configuration instead (see below in Misc)
* EmbeddedSolrServer depends on the Servlet API? "Also, if using 
EmbeddedSolrServer, keep in mind that Solr depends on the Servlet API." from 
http://wiki.apache.org/solr/Solrj
* EmbeddedSolrServer should be started by listening to the app started event 
instead of lazily in Initializable IMO
* Since we use EmbeddedSolrServer how do we handle clustering? One instance per 
wiki instance? How do they reconcile their indexes? Need an architecture 
diagram for our solution for heavy loads.
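
As an illustration of the "start on the app started event" point above, here is a minimal sketch. All types in it are hypothetical stand-ins (they are neither the XWiki observation classes nor Solr's EmbeddedSolrServer); it only shows the server being started by an event listener instead of lazily on first use.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types are the real XWiki or Solr classes.
final class EagerStartSketch {
    /** Stand-in for an "application started" event. */
    static final class ApplicationStarted { }

    /** Stand-in for an observation listener. */
    interface Listener {
        void onEvent(Object event);
    }

    /** Stand-in for the embedded search server; it only records whether it is running. */
    static final class EmbeddedServer {
        private boolean started;

        void start() {
            started = true;
        }

        boolean isStarted() {
            return started;
        }
    }

    /** Starts the server when the application-started event arrives, not on first search. */
    static final class ServerStarter implements Listener {
        private final EmbeddedServer server;

        ServerStarter(EmbeddedServer server) {
            this.server = server;
        }

        @Override
        public void onEvent(Object event) {
            if (event instanceof ApplicationStarted && !server.isStarted()) {
                server.start();
            }
        }
    }

    /** Minimal synchronous event bus, only here to make the sketch runnable. */
    static final class EventBus {
        private final List<Listener> listeners = new ArrayList<>();

        void register(Listener listener) {
            listeners.add(listener);
        }

        void fire(Object event) {
            for (Listener listener : listeners) {
                listener.onEvent(event);
            }
        }
    }

    public static void main(String[] args) {
        EmbeddedServer server = new EmbeddedServer();
        EventBus bus = new EventBus();
        bus.register(new ServerStarter(server));
        bus.fire(new ApplicationStarted());
        System.out.println("started: " + server.isStarted());
    }
}
```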

Misc:

* all APIs to review and improve/stabilize
* typos to fix
* licenses to fix
* pom to fix
* missing class javadoc (eg BuildIndex, DeleteIndex, etc)
* exception handling to verify (ex: SolrjSearchEngine)
* Remove unneeded javadoc when @Override
* Need to use the XWiki Permanent Directory for storing SOLR configuration data 
(the solr home) - Need to move the data currently in solr/ into a 
solr-configuration jar module which gets used as a fallback if the data doesn't 
exist in the solr home dir.
* Idea: use solr JMX to provide admin features 
(http://wiki.apache.org/solr/SolrJmx)
* TODO: Think about how to migrate users to use SOLR instead of Lucene or DB 
Searches. Need a plan.
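
For the solr home item above, the fallback logic could look roughly like the sketch below. It is only an assumption of how this might work: the "solr" subdirectory name is made up, and the bundledDefaults map stands in for the contents of the proposed solr-configuration jar (a real version would read them from the classpath).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of "use the permanent directory, fall back to bundled defaults".
final class SolrHomeSketch {
    /**
     * Returns the solr home under the permanent directory. If it does not exist yet,
     * it is created and filled with the bundled defaults; existing data is never overwritten.
     *
     * @param permanentDirectory the permanent directory of the wiki instance
     * @param bundledDefaults relative file path to file content (stand-in for the configuration jar)
     */
    static Path ensureSolrHome(Path permanentDirectory, Map<String, String> bundledDefaults)
        throws IOException {
        Path solrHome = permanentDirectory.resolve("solr");
        if (Files.isDirectory(solrHome)) {
            return solrHome;
        }
        Files.createDirectories(solrHome);
        for (Map.Entry<String, String> entry : bundledDefaults.entrySet()) {
            Path target = solrHome.resolve(entry.getKey());
            Files.createDirectories(target.getParent());
            Files.write(target, entry.getValue().getBytes(StandardCharsets.UTF_8));
        }
        return solrHome;
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> defaults = new HashMap<>();
        // Placeholder content, not a real Solr configuration.
        defaults.put("conf/solrconfig.xml", "<config/>");
        Path permanentDirectory = Files.createTempDirectory("permanent-dir");
        System.out.println(ensureSolrHome(permanentDirectory, defaults));
    }
}
```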

Thanks!
-Vincent

_______________________________________________
devs mailing list
[email protected]
http://lists.xwiki.org/mailman/listinfo/devs
