Hi Jörn,

I will be back from a short holiday next week; in the meantime, here is a quick list of thoughts on the points you raised.

Regarding the asserts in the initialize() method: they can be safely removed, as they were put there mainly for debugging purposes. The initialization of the Consumer would fail anyway if those params are null or badly defined, as you can see inside the createServer(type,path) and FieldMappingReader.getConf(path) methods.

The cas element in the mapping file is optional. I thought it was useful to track the CAS which delivered the information. In the sample file it gets mapped to an id field, but that doesn't mean it MUST be unique. Maybe the toString() method isn't the best way to store the CAS information, but I still think it makes sense not to lose such information.

I agree with the need to switch to the CAS API.

I also agree on enhancing the exception handling for debugging errors. If a commit fails, I think it should be handled the same way as a failing add(). Otherwise a commit policy parameter would have to be created (i.e. a cache of previously added documents, to try to re-send them), but I think that is out of the scope of a basic Solrcas implementation and more related to how Solr handles commit errors.

I'd introduce the already discussed autocommit configuration parameter (boolean) to indicate whether Solrcas should also send a commit to the SolrServer. It may also make sense to add a third value for this param, called 'destroy', that would trigger the commit only in the destroy() method, even though in that case any errors during the commit could not be recovered.

Regarding the EmbeddedSolrServer: I agree that it's generally not a top option in production, but I am now working on a Solr project where network latency has a significant impact (Solr being the best solution anyway), and I'd get a considerable advantage if I could query it without HTTP requests. However, since the main way to query Solr is via REST calls, I have no objections to removing it.

Thanks for the fix on UIMA-2041.

Cheers,
Tommaso
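
P.S. To make the three-valued autocommit idea a bit more concrete, here is a minimal sketch of how the parameter value could be parsed. Everything in it is hypothetical: the CommitPolicy name, its constants and the choice to treat a missing value as "no commit" are assumptions for illustration only, not existing Solrcas code.

```java
import java.util.Locale;

/**
 * Hypothetical three-valued commit setting (illustration only, not Solrcas API).
 * "true" commits after documents are added, "destroy" commits only in destroy(),
 * "false" never commits.
 */
enum CommitPolicy {
    NONE, AFTER_ADD, ON_DESTROY;

    /** Parses the configuration value; a missing value is assumed to mean NONE. */
    static CommitPolicy parse(String value) {
        if (value == null) {
            return NONE;
        }
        switch (value.trim().toLowerCase(Locale.ROOT)) {
            case "true":
                return AFTER_ADD;
            case "destroy":
                return ON_DESTROY;
            case "false":
            case "":
                return NONE;
            default:
                throw new IllegalArgumentException("Unknown autocommit value: " + value);
        }
    }

    /** True if a commit should follow each add. */
    boolean commitAfterAdd() {
        return this == AFTER_ADD;
    }

    /** True if the only commit should happen when the consumer is destroyed. */
    boolean commitOnDestroy() {
        return this == ON_DESTROY;
    }
}
```

The consumer would then check commitAfterAdd() after sending documents and commitOnDestroy() inside destroy(); with 'destroy' a failed commit can only be logged, since there is nothing left to retry against.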
On 08 Feb 2011, at 15:52, Jörn Kottmann wrote:

> I am trying to understand why there is a cas element in the mapping file.
>
> The documentation explains that it specifies the field in solr
> which is used to map the value of JCas.toString(), but why should
> anyone want to do that? The documentation sample maps it to
> the id field in solr.
>
> Can JCas.toString be used as an id?
> If I looked at the code correctly JCasImpl does not override toString,
> then simply Object.toString is called which produces a string based on
> the object address. In one JVM two objects could have the same address
> at two distinct points in time. Which would lead to identical ids for
> different documents.
> Anyway isn't the JCas instance reused? Then this will just be the same
> string depending on the instance over and over again.
>
> Jörn
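
The concern about a reused instance can be checked with a small stand-alone sketch. The ReusedHolder class below is hypothetical and only stands in for any object that does not override toString(); it does not use the UIMA API. For such an object, Object.toString() returns the class name, an '@', and the hex identity hash code, so the string stays the same however often the contents change.

```java
/**
 * Hypothetical stand-in for a reused container object that does not
 * override toString() (illustration only, not a UIMA class).
 */
final class ReusedHolder {
    String documentText;

    /** Uses the default Object.toString() as an "id", as the mapping would. */
    static String idFor(ReusedHolder holder) {
        return holder.toString();
    }

    /** Returns the ids produced for two different documents held by one instance. */
    static String[] idsForTwoDocuments() {
        ReusedHolder holder = new ReusedHolder();
        holder.documentText = "first document";
        String firstId = idFor(holder);
        holder.documentText = "second document"; // same instance, new content
        String secondId = idFor(holder);
        return new String[] { firstId, secondId };
    }
}
```

Both ids returned by idsForTwoDocuments() are equal, so a unique key field fed this way would keep receiving the same value for every document processed by that instance.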
