Re: AEM SOLR integration

2017-09-25 Thread Tommaso Teofili
Integration can be done in AEM at different layers; however, my suggestion would be to enable it at the repository (Oak) level [1] so that the usual AEM search also takes ACLs into account. [1] : http://jackrabbit.apache.org/oak/docs/query/solr.html On Fri, 22 Sep 2017 at 18:47

Re: Knn classifier doesn't work

2017-09-19 Thread Tommaso Teofili
Hi Alessandro, yes please, feel free to open a Jira issue, patches welcome! Tommaso On Mon, 18 Sep 2017 at 14:30, alessandro.benedetti < a.benede...@sease.io> wrote: > Hi Tommaso, > you are definitely right! > I see that the method : MultiFields.getTerms > returns : > if

Re: multi language search engine in solr

2017-09-11 Thread Tommaso Teofili
Another thing to consider is what users would expect: would English users search over English docs only? If yes, the most important task would be to correctly set up / create accurate per-language analyzers; otherwise you may also consider adopting machine translation, either on the search queries

Re: Knn classifier doesn't work

2017-09-02 Thread Tommaso Teofili
It sounds like none of the docs in your index has the "class" field, in your case Tags, whereas classification needs some bootstrapping (add some examples of correctly classified docs to the index beforehand). On the other hand the naive bayes implementation definitely has a bug as the

Re: Exception during integration of Solr with UIMA

2017-03-20 Thread Tommaso Teofili
Hi, the UIMA OpenCalais Annotator you're using refers to an old endpoint which is no longer available, see log line [1]. I would suggest to simply remove the OpenCalaisAnnotator entry from your UIMAUpdateRequestProcessor configuration in solrconfig.xml. More generally you should put only the UIMA

Re: Solr UIMA Custom Annotator PEAR file installation on Linux

2016-01-08 Thread Tommaso Teofili
Hi, do you mean you want to use a PEAR to provide the Annotator for the Solr UIMA UpdateProcessor ? Can you please detail a bit more your needs? Regards, Tommaso 2016-01-08 1:57 GMT+01:00 techqnq : > implemented custom annotator and generated the PEAR file. > Windos has the

Re: Using SimpleNaiveBayesClassifier in solr

2015-10-12 Thread Tommaso Teofili
Hi Yewint, the SNB classifier is not an online one, so you should retrain it every time you want to update it. What you pass to the Classifier is a Reader, therefore you should make sure that it stays accessible (do not close it) for classification to work. Regarding performance SNB becomes

Re: solr uima and opennlp

2015-06-01 Thread Tommaso Teofili
yeah, I think you'd rather post it to d...@uima.apache.org . Regards, Tommaso 2015-05-28 15:19 GMT+02:00 hossmaa andreea.hossm...@gmail.com: Hi Tommaso Thanks for the quick reply! I have another question about using the Dictionary Annotator, but I guess it's better to post it separately.

Re: solr uima and opennlp

2015-05-21 Thread Tommaso Teofili
Hi Andreea, 2015-05-21 18:12 GMT+02:00 hossmaa andreea.hossm...@gmail.com: Hi everyone I'm trying to plug in a new UIMA annotator into solr. What is necessary for this? Is it enough to build a Jar similarly to the ones from the uima-addons package? yes, exactly. Actually you just need a

Re: /suggest through SolrJ?

2015-04-29 Thread Tommaso Teofili
2015-04-27 19:22 GMT+02:00 Alessandro Benedetti benedetti.ale...@gmail.com : Just had the very same problem, and I confirm that currently is quite a mess to manage suggestions in SolrJ ! I have to go with manual Json parsing. or very not nice NamedList API mess (see an example in JR Oak

Re: Issue with multivalued fields in UIMA

2014-08-29 Thread Tommaso Teofili
Hi, it'd be good if you could open a Jira issue (with a patch preferably) describing your findings. Thanks, Tommaso 2014-08-29 18:34 GMT+02:00 mkhordad khorda...@gmail.com: I solved it. It was caused by a bug in UIMAUpdateRequestProcessor. -- View this message in context:

Tika analyzers

2014-07-30 Thread Tommaso Teofili
Hi all, while SolrCell works nicely when in need of indexing binary documents, I am wondering about the possibility of having Lucene / Solr documents that have binaries in specific Lucene fields, e.g. title=a nice doc, name=blabla.doc, binary=0x1234 In that case the binary field should have

Re: Integrate solr with openNLP

2014-06-04 Thread Tommaso Teofili
Hi all, Ahmet was suggesting to possibly use the UIMA integration because OpenNLP already has an integration with Apache UIMA and so you would just have to use that [1]. And that's one of the main reasons the UIMA integration was done: it's a framework that you can easily hook into in order to plug

Re: deep paging without sorting / keep IRs open

2014-05-19 Thread Tommaso Teofili
thanks Yonik, that looks promising, I'll have a look at it. Tommaso 2014-05-17 17:57 GMT+02:00 Yonik Seeley yo...@heliosearch.com: On Sat, May 17, 2014 at 10:30 AM, Yonik Seeley yo...@heliosearch.com wrote: I think searcher leases would fit the bill here?

deep paging without sorting / keep IRs open

2014-05-15 Thread Tommaso Teofili
Hi all, in one use case I'm working on [1] I am using Solr in combination with a MVCC system [2][3], so that the (Solr) index is kept up to date with the system and must handle search requests that are tied to a certain state / version of it and of course multiple searches based on different

Re: [Clustering] Full-Index Offline cluster

2014-03-10 Thread Tommaso Teofili
Hi Ahmet, Ale, right, there's a classification module for Lucene (and therefore usable in Solr as well), but no clustering support there. Regards, Tommaso 2014-03-10 19:15 GMT+01:00 Ahmet Arslan iori...@yahoo.com: Hi, Thats weird. As far as I know there is no such thing. There is

Re: Caching requests to Solr

2014-03-08 Thread Tommaso Teofili
following up on this, I've created https://issues.apache.org/jira/browse/SOLR-5826 , with a draft patch. Regards, Tommaso 2014-03-05 8:50 GMT+01:00 Tommaso Teofili tommaso.teof...@gmail.com: Hi all, I have the following requirement where I have an application talking to Solr via SolrJ where

Caching requests to Solr

2014-03-04 Thread Tommaso Teofili
Hi all, I have the following requirement where I have an application talking to Solr via SolrJ where I don't know upfront which type of Solr instance that will be communicating with, while this is easily solvable by using different SolrServer implementations I also need a way to ensure that all

Re: Alternatives to GATE?

2014-01-16 Thread Tommaso Teofili
If you need a framework to build your enhancement pipeline on I think Apache UIMA [1] is good as it's also able to store annotated documents into Lucene and Solr so it may be a good fit for your needs. Just consider that you have to learn how to use / develop on top of it, it's not a big deal but

Re: Too slow UIMA with Solr

2013-08-29 Thread Tommaso Teofili
Hi Jun, I agree the AE (instead of the AEProvider) should be cached on the UpdateRequestProcessor. In previous revisions [1] it was cached directly by the BasicAEProvider so there was no need for that in the UIMAUpdateRequestProcessor but, since that has changed, I agree that should be done there

Re: Too slow UIMA with Solr

2013-08-29 Thread Tommaso Teofili
p.s. see https://issues.apache.org/jira/browse/SOLR-5201 2013/8/29 Tommaso Teofili tommaso.teof...@gmail.com Hi Jun, I agree the AE (instead of the AEProvider) should be cached on the UpdateRequestProcessor. In previous revisions [1] it was cached directly by the BasicAEProvider so

Re: Document Similarity Algorithm at Solr/Lucene

2013-07-23 Thread Tommaso Teofili
Hi, you may leverage and / or improve the MLT component [1]. HTH, Tommaso [1] : http://wiki.apache.org/solr/MoreLikeThis 2013/7/23 Furkan KAMACI furkankam...@gmail.com Hi; Sometimes a huge part of a document may exist in another document. As like in student plagiarism or quotation of a

Re: Document Similarity Algorithm at Solr/Lucene

2013-07-23 Thread Tommaso Teofili
to help you mark other texts as quote / plagiarism HTH, Tommaso 2013/7/23 Furkan KAMACI furkankam...@gmail.com Actually I need a specialized algorithm. I want to use that algorithm to detect duplicate blog posts. 2013/7/23 Tommaso Teofili tommaso.teof...@gmail.com Hi, you may leverage

Re: Solr UIMA

2013-02-21 Thread Tommaso Teofili
Hi Bart, I think the only way you can do that is by reindexing, or maybe by just doing a dummy atomic update [1] to each of the documents (e.g. adding or changing a field of type 'ignored' or something like that) that weren't tagged by UIMA before. Regards, Tommaso [1] :

Re: which analyzer is used for facet.query?

2013-02-13 Thread Tommaso Teofili
I agree that's definitely strange, I'll have a look at it. Tommaso 2013/2/12 Chris Hostetter hossman_luc...@fucit.org : So it seems that facet.query is using the analyzer of type index. : Is it a bug or is there another analyzer type for the facet query? That doesn't really make any

Re: Indexing nouns only with UIMA works - performance issue?

2013-02-05 Thread Tommaso Teofili
descriptorPath=/uima/AggregateSentenceAE.xml tokenType=org.apache.uima.SentenceAnnotation ngramsize=2 modelFile=file:german/TuebaModel.dat / ??? Thanks, Kai -Original Message- From: Tommaso Teofili [mailto:tommaso.teof...@gmail.com] Sent: Monday, February 04, 2013 2:47 PM

Re: Indexing nouns only with UIMA works - performance issue?

2013-02-04 Thread Tommaso Teofili
Thanks Kai for your feedback, I'll look into it and let you know. Regards, Tommaso 2013/2/1 Kai Gülzau kguel...@novomind.com I now use the stupid way to use the german corpus for UIMA: copy + paste :-) I modified the Tagger-2.3.1.jar/HmmTagger.xml to use the german corpus ...

Re: Indexing nouns only with UIMA works - performance issue?

2013-02-04 Thread Tommaso Teofili
Regarding configuration parameters have a look at https://issues.apache.org/jira/browse/LUCENE-4749 Regards, Tommaso 2013/2/4 Tommaso Teofili tommaso.teof...@gmail.com Thanks Kai for your feedback, I'll look into it and let you know. Regards, Tommaso 2013/2/1 Kai Gülzau kguel

Re: Indexing nouns only with UIMA works - performance issue?

2013-02-04 Thread Tommaso Teofili
with the given actual value. HTH, Tommaso 2013/2/4 Tommaso Teofili tommaso.teof...@gmail.com Regarding configuration parameters have a look at https://issues.apache.org/jira/browse/LUCENE-4749 Regards, Tommaso 2013/2/4 Tommaso Teofili tommaso.teof...@gmail.com Thanks Kai for your feedback

Re: Solr UIMA with KEA

2012-11-23 Thread Tommaso Teofili
the AlchemyAPI service is not mandatory (it's there just as an example and can be safely removed), you can use whatever service you want as long as it's wrapped by a UIMA AnalysisEngine and you specify its descriptor. See following updateChain example configuration : updateRequestProcessorChain

Re: UIMA for lemmatization

2012-09-25 Thread Tommaso Teofili
Hi, I think you'd better ask this on u...@uima.apache.org list as this is more related to Apache UIMA itself rather than to Apache Solr. Regards, Tommaso 2012/9/25 abhayd ajdabhol...@hotmail.com hi I m new to UIMA. Solr doea not have lemmatization component, i was thinking of using UIMA

Re: Backup strategy for SolrCloud

2012-09-21 Thread Tommaso Teofili
I also think that's a good question and currently without a "use this" answer :-) I think it shouldn't be hard to write a Solr service that queries ZK and replicates both conf and indexes (via SnapPuller or ZK itself) so that such a node is responsible to back up the whole cluster in a secure storage

Re: Embedded Server Issue : SOLRJ : No Such Core Found

2012-09-19 Thread Tommaso Teofili
Hi Senthil, try using the following: CoreContainer coreContainer = new CoreContainer.Initializer().initialize(); SolrServer solrServer = new EmbeddedSolrServer(coreContainer, "collection1"); Hope it helps, Tommaso 2012/9/19 Senthil Kk Mani sentm...@in.ibm.com Hi, I am facing an issue

Re: Levenstein Distance

2012-06-07 Thread Tommaso Teofili
During the analysis phase you could add payloads to the terms using LevensteinDistance and then use that in conjunction with a PayloadSimilarity class (see [1] for an example), or just use a custom Similarity class which uses LevensteinDistance for scoring. HTH Tommaso [1] :
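
As a rough illustration of the measure involved, here is a minimal Python sketch of the Levenshtein edit distance and a similarity score derived from it. The function names are hypothetical, this is not Lucene's LevensteinDistance code, and the normalization (1 minus distance over the longer length) is an assumption chosen for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    # previous[j] is the distance between the already processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Map the distance to [0, 1]; 1.0 means identical (assumed normalization)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest
```

A score like this could be what gets stored as a payload or computed inside a custom Similarity; how it is wired into scoring is left open here.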

Re: Solr with UIMA

2012-06-04 Thread Tommaso Teofili
Hi all, 2012/6/1 Jack Krupansky j...@basetechnology.com Is it failing on the first document? I see uid 5, suggests that it is not. If not, how is this document different from the others? I see the exception org.apache.uima.resource.ResourceInitializationException, suggesting that some

Re: shard distribution of multiple collections in SolrCloud

2012-05-24 Thread Tommaso Teofili
2012/5/23 Mark Miller markrmil...@gmail.com Yeah, currently you have to create the core on each node...we are working on a 'collections' api that will make this a simple one call operation. Mark, is there a Jira for that yet? Tommaso We should have this soon. - Mark On May 23, 2012,

Re: shard distribution of multiple collections in SolrCloud

2012-05-24 Thread Tommaso Teofili
and try to help there. Tommaso On May 24, 2012, at 4:39 AM, Tommaso Teofili wrote: 2012/5/23 Mark Miller markrmil...@gmail.com Yeah, currently you have to create the core on each node...we are working on a 'collections' api that will make this a simple one call operation. Mark

Re: Problem with AND clause in multi core search query

2012-05-15 Thread Tommaso Teofili
The latter is supposed to work: http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=column1:A OR column2:B The first query cannot work as there is no document in either core0 or core1 which has A in field column1 and B in field column2 but
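
The reasoning can be sketched with a toy model in Python. It assumes, hypothetically, that core0 only holds documents with column1 and core1 only holds documents with column2; the helper names are made up, and the model only mimics the fact that each document is matched on its own before shard results are merged.

```python
def matches(doc, clauses, operator):
    """Check (field, value) clauses against one document with AND or OR semantics."""
    hits = [doc.get(field) == value for field, value in clauses]
    return all(hits) if operator == "AND" else any(hits)


def distributed_search(shards, clauses, operator):
    """Evaluate the query on every shard and merge the matching document ids."""
    return [doc["id"]
            for shard in shards
            for doc in shard
            if matches(doc, clauses, operator)]


# Hypothetical contents: no single document carries both fields.
core0 = [{"id": "c0-1", "column1": "A"}]
core1 = [{"id": "c1-1", "column2": "B"}]
clauses = [("column1", "A"), ("column2", "B")]

and_ids = distributed_search([core0, core1], clauses, "AND")  # no document satisfies both
or_ids = distributed_search([core0, core1], clauses, "OR")    # one hit from each core
```

Under these assumptions the AND query returns nothing, while the OR query returns one document per core.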

Re: Solr with UIMA

2012-04-04 Thread Tommaso Teofili
Hi again Chris, I finally managed to find some proper time to test your configuration. First thing to notice is that it worked for me assuming the following pre-requisites were satisfied: - you had the jar containing the AnalysisEngine for the RoomAnnotator.xml in your libraries section (this is

Re: Using UIMA in Solr behind a firewall

2012-04-04 Thread Tommaso Teofili
Hello Peter, I think that is more related to UIMA AlchemyAPIAnnotator [1] or to AlchemyAPI services themselves [2] because Solr just uses the out of the box UIMA AnalysisEngine for that. Thus it may make sense to ask on d...@uima.apache.org (or even directly to AlchemyAPI guys). HTH, Tommaso [1]

Re: Solr with UIMA

2012-03-28 Thread Tommaso Teofili
Hi Chris, 2012/3/28 chris3001 chrislia...@hotmail.com I am having a hard time integrating UIMA with Solr. I have downloaded the Solr 3.5 dist and have it successfully running with nutch and tika on windows 7 using solrcell and curl via cygwin. To begin, I copied the 6 jars from

Re: Solr with UIMA

2012-03-28 Thread Tommaso Teofili
Hi Chris, I have never tried the Nutch integration so I can't help with that. However I'll try to repeat your same setup and will let you know what comes out for me. Tommaso 2012/3/28 chris3001 chrislia...@hotmail.com Still not getting there on Solr with UIMA... Has anyone taken example 1

Re: Solr Monitoring / Stats

2012-03-15 Thread Tommaso Teofili
would http://www.lucidimagination.com/blog/2011/10/02/monitoring-apache-solr-and-lucidworks-with-zabbix/ work for your scenario? Tommaso 2012/3/12 Alex Leonhardt aleonha...@venda.com Hi All, I was wondering if anyone knows of a free tool to use to monitor multiple Solr hosts under one roof ?

Re: Reporting tools

2012-03-09 Thread Tommaso Teofili
as Gora says there is the stats component you can take advantage of or you could also use JMX directly [1] or LucidGaze [2][3] or commercial services like [4] or [5] (these are the ones I know but there may be also others), each of them with different level/type of service. Tommaso [1] :

Re: in solr how to support Document.SetBoost as lucene?

2012-03-07 Thread Tommaso Teofili
When indexing a Solr document by sending XML files via HTTP POST you can set it by adding the boost attribute to the doc element, see http://wiki.apache.org/solr/UpdateXmlMessages#Optional_attributes_on_.22doc.22 If you plan to index using the Java APIs (SolrJ, see http://wiki.apache.org/solr/Solrj) you
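
For the XML route, a minimal Python sketch of building such an update message might look as follows. The field names, the values and the helper name are hypothetical; only the boost attribute on the doc element comes from the wiki page referenced above, and whether it is honoured depends on the Solr version in use.

```python
import xml.etree.ElementTree as ET


def build_add_message(fields, boost=None):
    """Build an <add><doc>...</doc></add> update message as a string."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    if boost is not None:
        # Document-level boost, set as an optional attribute on <doc>
        doc.set("boost", str(boost))
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", name=name)
        field.text = value
    return ET.tostring(add, encoding="unicode")


# Hypothetical document with a boost of 2.5
message = build_add_message({"id": "doc1", "title": "a nice doc"}, boost=2.5)
```

The resulting string is what would be POSTed to the update handler.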

Re: performance between ExternalFileField and Join

2012-03-01 Thread Tommaso Teofili
Also regarding the Join functionality I remember Yonik pointed out it's O(# unique terms) but I agree with Erick on the ExternalFileField as you can use it just inside a function query, for example, for boosting. Tommaso 2012/3/1 Erick Erickson erickerick...@gmail.com Hmmm. ExternalFileFields

Re: proper syntax for using sort query parameter in responseHandler

2012-02-17 Thread Tommaso Teofili
Hi Mark, Having a look at that requestHandler it looks ok [1], are you experiencing any errors? If so did you check the wiki page FieldOptionsByUseCase [2], maybe that field (rankNo) options contain indexed=false or multiValued=true? HTH, Tommaso [1] :

Re: How to do this in Solr? random result for the first few results

2012-02-09 Thread Tommaso Teofili
I think you may use/customize the query elevation component to achieve that. http://wiki.apache.org/solr/QueryElevationComponent Tommaso 2012/2/9 mtheone mthe...@gmail.com Say I have a classified ads site, I want to display 2 random items (premium ads) in the beginning of the search result and

Re: Sorting solrdocumentlist object after querying

2012-02-09 Thread Tommaso Teofili
Hi Kashif, maybe the field collapsing feature [1] may help you with your requirement. Hope this helps, Tommaso [1] : http://wiki.apache.org/solr/FieldCollapsing

Re: How to get the time document was indexed?

2012-01-20 Thread Tommaso Teofili
Hi Alex, you can create a field in the schema.xml of type date or tdate called (something like) idx_timestamp and set its default option to NOW then you won't have to add any extra fields to the documents because it will be automatically created when documents are indexed. Hope it helps. Tommaso

Re: Problems with SolrUIMA

2011-12-10 Thread Tommaso Teofili
Hello Adriana, your configuration looks fine to me. The exception you pasted makes me think you're using a Solr instance at a certain version (3.4.0) while the Solr-UIMA module jar is at a different version; I remember there has been a change in the UpdateRequestProcessorFactory API at some point

Re: Document Processing

2011-12-06 Thread Tommaso Teofili
Hello Michael, I can help you with using the UIMA UpdateRequestProcessor [1]; the current implementation uses in-memory execution of UIMA pipelines but since I was planning to add the support for higher scalability (with UIMA-AS [2]) that may help you as well. Tommaso [1] :

Re: Upgratding the Index from 1.4.1 to 3.4 using replication

2011-10-27 Thread Tommaso Teofili
I don't think it'll work as I've tried this approach myself and the blocking issue was that Solr 1.4.1 uses a different javabin version than Solr 3.4 (I think it's 1 vs 2) so the master and the slave(s) can't communicate using standard replication handler and thus can't exchange information and

Re: UIMA DictionaryAnnotator partOfSpeach

2011-09-29 Thread Tommaso Teofili
I think one problem is that the featurePath is not set correctly. Note that you are assuming PoS are written somewhere in some annotation feature, so this means you should have set up the UIMA pipeline to also include, for example, the HMM Tagger [1] which adds (by default) the posTag feature to

Different Solr versions between Master and Slave(s)

2011-09-19 Thread Tommaso Teofili
Hi all, while thinking about a migration plan of a Solr 1.4.1 master / slave architecture (1 master with N slaves already in production) to Solr 3.x I imagined to go for a graceful migration, starting with migrating only one/two slaves, making the needed tests on those while still offering the

Re: solr UIMA exception

2011-08-29 Thread Tommaso Teofili
The UIMA AlchemyAPI annotator is failing for you due to an error on the server side and I think you should look at your Solr UIMA configuration as it seems you wanted to extract entities from text: Senator Dick Durbin (D-IL) Chicago , March 3,2007. while the error says

Re: Solr UIMA integration problem

2011-08-17 Thread Tommaso Teofili
At first glance I think the problem is in the 'feature' element which is set to 'title'. The 'feature' element should contain a UIMA Feature of the type defined in element 'type'; for example, SentenceAnnotation [1], defined in HMM Tagger, has 'only' the default features of a UIMA Annotation:

Re: (Solr-UIMA) Indexing problems with UIMA fields.

2011-07-14 Thread Tommaso Teofili
://solrurl:solrport/solr/admin/logging Hope this helps, Tommaso S On Wed, Jul 13, 2011 at 4:48 PM, Tommaso Teofili tommaso.teof...@gmail.com wrote: Hello, I think the problem might be the following, if you defined the update request handlers like in the sample solrconfig

Re: (Solr-UIMA) Indexing problems with UIMA fields.

2011-07-13 Thread Tommaso Teofili
Hello, I think the problem might be the following, if you defined the update request handlers like in the sample solrconfig : <updateRequestProcessorChain name="uima"> <processor class="org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory"> <lst name="uimaConfig">

Re: Different Indexing formats for Older Lucene versions and Solr?

2011-07-05 Thread Tommaso Teofili
Which Lucene version were you using? Regards, Tommaso 2011/7/5 Sowmya V.B. vbsow...@gmail.com Hi All A quick doubt on the index files of Lucene and Solr. I had an older version of lucene (with UIMA) till recently, and had an index built thus. I shifted to Solr (3.3, with UIMA)..and tried

Re: Problems using Solr with UIMA

2011-07-04 Thread Tommaso Teofili
Hello Sowmya, Is the problem a ClassNotFoundException? If so check that there is a lib element referencing the solr-uima jar. Otherwise it may be some configuration error. By the way, which version of Solr are you using ? I ask since you're seeing README for trunk but you may be using Solr jars with

Re: Problems using Solr with UIMA

2011-07-04 Thread Tommaso Teofili
. Sowmya. On Mon, Jul 4, 2011 at 2:15 PM, Tommaso Teofili tommaso.teof...@gmail.com wrote: Hello Sowmya, Is the problem a ClassNotFoundException? If so check that there is a lib element referencing the solr-uima jar. Otherwise it may be some configuration error. By the way, which version

Re: Query time noun, verb boosting

2011-06-24 Thread Tommaso Teofili
2011/6/23 Anshum ansh...@gmail.com Pooja, You could use UIMA (or any other) Parts of Speech Tagger. You could read a little more about it here. http://uima.apache.org/downloads/sandbox/hmmTaggerUsersGuide/hmmTaggerUsersGuide.html#sandbox.tagger.annotatorDescriptor This would help you

Re: Showing facet of first N docs

2011-06-20 Thread Tommaso Teofili
playing a bit with dismax and bq. I think the problem is just in how the facets are being used, I think a customized SpellChecker sounds like the right component to provide smart suggestions. 2011/6/20 Toke Eskildsen t...@statsbiblioteket.dk On Thu, 2011-06-16 at 12:39 +0200, Tommaso Teofili wrote

Showing facet of first N docs

2011-06-16 Thread Tommaso Teofili
Hi all, Do you know if it is possible to show the facets for a particular field related only to the first N docs of the total number of results? It seems facet.limit doesn't help with it as it defines a window in the facet constraints returned. Thanks in advance, Tommaso

Re: Showing facet of first N docs

2011-06-16 Thread Tommaso Teofili
value is 0. This parameter can be specified on a per field basis. Dmitry On Thu, Jun 16, 2011 at 1:39 PM, Tommaso Teofili tommaso.teof...@gmail.com wrote: Hi all, Do you know if it is possible to show the facets for a particular field related only to the first N docs of the total number

Re: [Mahout] Integration with Solr

2011-06-09 Thread Tommaso Teofili
Hello Adam, I've managed to create a small POC of integrating Mahout with Solr for a clustering task, do you want to use it for clustering only or possibly for other purposes/algorithms? More generally speaking, I think it'd be nice if Solr could be extended with a proper API for integrating

Re: How can I query mutlitcore with solrJ

2011-05-20 Thread Tommaso Teofili
Or, if you want results from both together, you can use the distributed search [1]. Just decide which one of the cores will be the collector and add the shards=localhost:8983/solr/fund_dih,localhost:8983/solr/fund_tika parameter like : SolrServer server = new CommonsHttpSolrServer(
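
Independently of SolrJ, the same request can be expressed as a plain URL. The Python sketch below builds it, assuming fund_dih is picked as the collector core and using a match-all query as a placeholder; the helper name is hypothetical, and the host and core names are the ones from the message above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def distributed_query_url(collector, shards, query):
    """Build a select URL on the collector core that fans out to all shards."""
    params = {"shards": ",".join(shards), "q": query}
    return f"http://{collector}/select?{urlencode(params)}"


url = distributed_query_url(
    "localhost:8983/solr/fund_dih",                       # collector core (assumed choice)
    ["localhost:8983/solr/fund_dih", "localhost:8983/solr/fund_tika"],
    "*:*",                                                # placeholder query
)

# Decode the query string again to inspect what the server would receive.
parsed = parse_qs(urlsplit(url).query)
```

The shard list goes into a single comma-separated shards parameter, without the http:// prefix on each entry.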

Re: UIMA analysisEngine path

2011-05-18 Thread Tommaso Teofili
? the UpdateRequestProcessorChain API has changed from 1.4.1 to 3.1.0 so, although it should be easy to back port, it's not compatible with Solr 1.4.1 out of the box. Tommaso Thanks again On Tue, May 17, 2011 at 12:13 PM, Tommaso Teofili [via Lucene] ml-node+2952043-2093755785-399...@n3

Re: UIMA analysisEngine path

2011-05-17 Thread Tommaso Teofili
PM, Tommaso Teofili [via Lucene] ml-node+2948866-1333438441-399...@n3.nabble.com wrote: The error you pasted doesn't seem to be related to a (class)path issue but more likely to be related to a Solr instance at 1.4.1/3.1.0 and Solr-UIMA module at 3.1.0/4.0-SNAPSHOT(trunk); it seems

Re: UIMA analysisEngine path

2011-05-16 Thread Tommaso Teofili
Hello, if you want to take the descriptor from a jar, provided that you configured the jar inside a lib element in solrconfig, then you just need to write the correct classpath in the analysisEngine element. For example if your descriptor resides in com/something/desc/ path inside the jar then

Re: UIMA analysisEngine path

2011-05-16 Thread Tommaso Teofili
, 2011 at 9:17 AM, Tommaso Teofili [via Lucene] ml-node+2946920-843126873-399...@n3.nabble.com wrote: Hello, if you want to take the descriptor from a jar, provided that you configured the jar inside a lib element in solrconfig, then you just need to write the correct classpath

Re: uima fieldMappings and solr dynamicField

2011-05-09 Thread Tommaso Teofili
Thanks Koji for opening that, the dynamicField mapping is a commonly used feature especially for named entities mapping. Tommaso 2011/5/7 Koji Sekiguchi k...@r.email.ne.jp I've opened https://issues.apache.org/jira/browse/SOLR-2503 . Koji -- http://www.rondhuit.com/en/ (11/05/06 20:15),

Re: UIMA analysisEngine path

2011-05-06 Thread Tommaso Teofili
the descriptor from the jar breaks that since OverridingParamsAEProvider uses the XMLInputSource method without relative path signature. Barry On 5/4/2011 6:16 AM, Tommaso Teofili wrote: Hello Barry, the main AnalysisEngine descriptor defined inside the analysisEngine element should

Re: UIMA analysisEngine path

2011-05-06 Thread Tommaso Teofili
descriptors bundled inside the jars/pears but this addition sounds like a good improvement so, basically, let's do it ;-) Regards, Tommaso [1] : http://uima.apache.org/d/uimaj-2.3.1/api/org/apache/uima/util/XMLInputSource.html#XMLInputSource(java.net.URL) Barry On 5/6/2011 8:47 AM, Tommaso Teofili

Re: UIMA analysisEngine path

2011-05-04 Thread Tommaso Teofili
Hello Barry, the main AnalysisEngine descriptor defined inside the analysisEngine element should be inside one of the jars imported with the lib elements. At the moment it cannot be taken from expanded directories but it should be easy to do it (and indeed useful) modifying the

Re: solr- Uima integration

2011-04-19 Thread Tommaso Teofili
Hi Isha 2011/4/18 Isha Garg isha.g...@orkash.com Can anyone explain me the what are runtimeParameters specified in the uimaConfig as in link http://wiki.apache.org/solr/SolrUIMA. also tell me how to integrate our own analysis engine to solr. I am new to this. the runtimeParameters

Re: Viewing Raw index data

2011-04-19 Thread Tommaso Teofili
Hello Dave, the LukeRequestHandler [1] and the Analysis service [2] should help you : Regards, Tommaso [1] : http://wiki.apache.org/solr/LukeRequestHandler [2] : http://wiki.apache.org/solr/FAQ#My_search_returns_too_many_.2BAC8_too_little_.2BAC8_unexpected_results.2C_how_to_debug.3F 2011/4/19

AbstractSolrTestCase and Solr 3.1.0

2011-04-12 Thread Tommaso Teofili
Hi all, I am porting a series of Solr plugins previously developed for version 1.4.1 to 3.1.0. I've written some integration tests extending the AbstractSolrTestCase [1] utility class but now it seems that wasn't included in the solr-core 3.1.0 artifact as it's in the solr/src/test directory. Was

Re: AbstractSolrTestCase and Solr 3.1.0

2011-04-12 Thread Tommaso Teofili
Thanks Robert, that was very useful :) Tommaso 2011/4/12 Robert Muir rcm...@gmail.com On Tue, Apr 12, 2011 at 6:44 AM, Tommaso Teofili tommaso.teof...@gmail.com wrote: Hi all, I am porting a previously series of Solr plugins developed for 1.4.1 version to 3.1.0, I've written some

Re: UIMA example setup w/o OpenCalais

2011-04-08 Thread Tommaso Teofili
Hi Jay, you should be able to do so by simply removing the OpenCalaisAnnotator from the execution pipeline commenting the line 124 of the file: solr/contrib/uima/src/main/resources/org/apache/uima/desc/OverridingParamsExtServicesAE.xml Hope this helps, Tommaso 2011/4/7 Jay Luker

Re: boosting with standard search handler

2011-03-24 Thread Tommaso Teofili
Hi Gastone, I used to do that in the standard search handler using the following parameters: q={!boost b=query($qq,0.7)} text:something title:other and qq=date:[NOW-60DAY TO NOW]^5 OR date:[NOW-15DAY TO NOW]^8, which enable custom recency-based boosting. My 2 cents, Tommaso 2011/3/24 Gastone Penzo
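
These parameter values contain characters such as braces, dollar signs, brackets and carets, so they need URL encoding when sent over HTTP. The Python sketch below encodes the two parameters from the message and decodes them again as a sanity check; it does not contact a Solr instance, and the variable names are made up.

```python
from urllib.parse import parse_qs, urlencode

# Parameter values copied from the message above.
params = {
    "q": "{!boost b=query($qq,0.7)} text:something title:other",
    "qq": "date:[NOW-60DAY TO NOW]^5 OR date:[NOW-15DAY TO NOW]^8",
}

# Encoded form, suitable for appending after "select?" in a request URL.
query_string = urlencode(params)

# Round trip: decoding must give back exactly what was put in.
decoded = parse_qs(query_string)
```

The $qq reference inside q is resolved by Solr from the separate qq parameter, which is why both have to be sent in the same request.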

Re: invert terms in search with exact match

2011-03-24 Thread Tommaso Teofili
Hi Gastone, I think you should use proximity search as described in the Lucene query syntax page [1]. So searching for "my love"~2 should work for your use case. Cheers, Tommaso [1] : http://lucene.apache.org/java/2_9_3/queryparsersyntax.html#ProximitySearches 2011/3/24 Gastone Penzo

Solr UIMA Wiki page

2011-03-09 Thread Tommaso Teofili
Hi all, I just improved the Solr UIMA integration wiki page [1] so if anyone is using it and/or has any feedback it'd be more than welcome. Regards, Tommaso [1] : http://wiki.apache.org/solr/SolrUIMA

Re: Use of multiple tomcat instance and shards.

2011-03-08 Thread Tommaso Teofili
Hi, from my experience when you have to scale in the number of documents it's a good idea to use shards (so one schema and N shards containing (1/N)*total#docs) while if the requirement is granting high query volume response you could get a significant boost from replicating the same index on 2 or

Re: Use of multiple tomcat instance and shards.

2011-03-08 Thread Tommaso Teofili
Hi Rajani, 2011/3/8 rajini maski rajinima...@gmail.com Tommaso, Please can you share any link that explains me about how to enable and do load balancing on the machines that you did mention above..? if you're querying Solr via SolrJ [1] you could use the LBHttpSolrServer [2]

Re: Use of multiple tomcat instance and shards.

2011-03-08 Thread Tommaso Teofili
Tommaso Teofili tommaso.teof...@gmail.com Hi Rajani, i 2011/3/8 rajini maski rajinima...@gmail.com Tommaso, Please can you share any link that explains me about how to enable and do load balancing on the machines that you did mention above..? if you're querying Solr via SolrJ [1] you

Re: Faceting

2011-02-21 Thread Tommaso Teofili
Hi Praveen, as far as I understand you have to set the type of the field(s) you are searching over to be conservative. So for example you won't include stemmer and lowercase filters and use only a whitespace tokenizer; moreover you should search with the default operator set to AND. Then faceting

Re: Best way for a query-expander?

2011-02-18 Thread Tommaso Teofili
Hi Paul, a colleague and I worked on a QParserPlugin to expand alias field names to many existing field names, e.g. q=mockfield:val => q=actualfield1:val OR actualfield2:val, but if you want to be able to use other params that come from the HTTP request you should use a custom RequestHandler I
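
The rewriting idea can be sketched outside Solr in a few lines of Python. This is not the QParserPlugin mentioned above: the alias table and function name are hypothetical, and the regular expression only handles simple field:term clauses.

```python
import re

# Hypothetical alias table: one alias expands to several real fields.
ALIASES = {"mockfield": ["actualfield1", "actualfield2"]}


def expand_aliases(query, aliases=ALIASES):
    """Rewrite alias:term clauses into an OR over the real fields."""
    def replace(match):
        field, value = match.group(1), match.group(2)
        targets = aliases.get(field)
        if not targets:
            return match.group(0)  # not an alias, leave the clause untouched
        expanded = " OR ".join(f"{target}:{value}" for target in targets)
        # Parenthesize multi-field expansions so surrounding operators keep their scope.
        return f"({expanded})" if len(targets) > 1 else expanded

    return re.sub(r"(\w+):(\S+)", replace, query)
```

For instance, expand_aliases("mockfield:val AND title:solr") yields "(actualfield1:val OR actualfield2:val) AND title:solr"; the parentheses are an addition of this sketch to keep the AND from binding to only one of the expanded clauses.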

Re: UIMA Error

2011-02-05 Thread Tommaso Teofili
Hi Darx, are you running it without an internet connection? As the problem seems to be that the OpenCalais service host cannot be resolved. Remember that you can select which UIMA annotators run inside the OverridingParamsAggregateAEDescriptor.xml. Hope this helps. Tommaso 2011/2/5, Darx Oman

Re: UIMA Error

2011-02-05 Thread Tommaso Teofili
Hi Darx, the other one in the base configuration is the AlchemyAPIAnnotator. Cheers, Tommaso 2011/2/6, Darx Oman darxo...@gmail.com: Hi Tommaso yes my server isn't connected to the internet. what other UIMA annotators that I can run which doesn't require an internet connection?

Re: solr - uima error

2011-01-30 Thread Tommaso Teofili
I found the issue is in the README.txt as the right class to use is UIMAUpdateRequestProcessorFactory, please change that in your solrconfig. Regards, Tommaso 2011/1/30 Darx Oman darxo...@gmail.com Hi I already copied apache-solr-uima-4.0-SNAPSHOT.jar to solr\lib but what causing the

Re: solr - uima error

2011-01-29 Thread Tommaso Teofili
Hi Darx, you need to run 'ant dist' under solr/contrib/uima and then reference the created jar (under solr/contrib/uima/build) inside the solrconfig.xml (lib tag) of your instance. Hope this helps, Tommaso 2011/1/29 Darx Oman darxo...@gmail.com I tried to do the uima integration with solr I

Re: Searchers and Warmups

2011-01-14 Thread Tommaso Teofili
Hi David, the idea is that you can define some listeners which run a list of queries against an IndexSearcher. In particular the firstSearcher event is related to the very first IndexSearcher being created inside the Solr instance while the newSearcher is the event related to the creation of a new

Solr and UIMA #2

2011-01-04 Thread Tommaso Teofili
Hi all, just a quick notice to let you know that a new component to consume UIMA objects to a (local or remote) Solr instance is available inside UIMA sandbox [1]. Note that this writes to Solr from UIMA pipelines (push) while in SOLR-2129 [2] Solr asks UIMA to extract metadata while indexing

Re: Problem with multicore

2010-12-15 Thread Tommaso Teofili
Hi Jörg, I think the first thing you should check is your Ubuntu's encoding, second one is file permissions (BTW why are you sudoing?). Did you try using the bash script under example/exampledocs named post.sh (use it like this: 'sh post.sh *.xml') Cheers, Tommaso 2010/12/15 Jörg Agatz

Parenthesis in query string

2010-12-15 Thread Tommaso Teofili
Hi all, I've just noticed a strange behavior (or, at least, I didn't expect that), when adding redundant parentheses to a query. Using the lucene query parser in Solr I get no results with the query: * ((( NOT (text:something))) AND date = 2010-12-15) * while I get the expected results when the

Re: Taxonomy and Faceting

2010-12-13 Thread Tommaso Teofili
With the SOLR-2129 patch you enable an Apache UIMA [1] pipeline to enrich documents being indexed. The base pipeline provided with the patch uses the following blocks (see OverridingParamsExtServicesAE.xml): <node>AggregateSentenceAE</node> <node>OpenCalaisAnnotator</node>

Re: Indexing documents with SOLR

2010-12-10 Thread Tommaso Teofili
Hi Pankaj, you can find the needed documentation right here [1]. Hope this helps, Tommaso [1] : http://wiki.apache.org/solr/ExtractingRequestHandler 2010/12/10 pankaj bhatt panbh...@gmail.com Hi All, I am a newbie to SOLR and trying to integrate TIKA + SOLR. Can anyone please guide me,

Re: Taxonomy and Faceting

2010-12-08 Thread Tommaso Teofili
Thanks Markus for helping with that, there are some changes in the configuration that need to be done. However I've just submitted a new patch at [1] which fixes jar packaging and includes a README.txt which contains the following, it's very simple : 1. copy generated solr-uima jar and its libs
