Schema change with an additional copyField

2010-04-13 Thread Andrea Gazzarini
Hi, after indexing a lot of data I found that my schema is missing the copyField declaration for my spell field. :( The question is: do I have to reindex all the documents? I'm asking because the new field is just a copy of an existing one, so I was wondering if SOLR is able to

Jetty, Tomcat or JBoss?

2010-04-17 Thread Andrea Gazzarini
Hi all, I have a web application which is basically a (user) search interface towards SOLR. My index is something like 7GB and has a lot of records, so apart from other things like optimizing the SOLR schema, config, clustering, etc., I'd like to keep the SOLR installation as light as possible. At the moment

Re: schema.xml XSD/DTD

2010-05-05 Thread Andrea Gazzarini
The same for me; IMO it would be nice to have that. Regards, Andrea On 05/05/2010 11:58, Jon Poulton wrote: Morning all, I was wondering if anyone had written an XSD/DTD for schema.xml? A quick look at the wiki (http://wiki.apache.org/solr/SchemaXml) suggests that this has yet

diacritics on query string

2010-08-13 Thread Andrea Gazzarini
Hi, I have a problem regarding a diacritic character in my query string: q=intertestualità, which is encoded as q=intertestualit%E0. What I don't understand are the following query response fragments: <lst name="responseHeader"> <int name="status">0</int> <int name="QTime">23</int> <lst
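
The snippet above is cut off, so the cause discussed in the thread is not visible here. As a small illustration only, the following Python sketch shows one relevant fact about the encoded value: `%E0` is how "à" is percent-encoded under ISO-8859-1, whereas UTF-8 produces `%C3%A0`. Nothing here talks to Solr; the variable names are made up for the example.

```python
from urllib.parse import quote, unquote

term = "intertestualità"

# ISO-8859-1 encodes "à" as the single byte 0xE0.
latin1_form = quote(term, encoding="iso-8859-1")

# UTF-8 encodes "à" as the two bytes 0xC3 0xA0.
utf8_form = quote(term, encoding="utf-8")

print(latin1_form)  # intertestualit%E0
print(utf8_form)    # intertestualit%C3%A0

# Decoding the ISO-8859-1 form as if it were UTF-8 does not
# give back the original term: 0xE0 alone is not valid UTF-8.
decoded_wrong = unquote(latin1_form, encoding="utf-8", errors="replace")
print(decoded_wrong == term)  # False
```

If a client and a server disagree on which of the two encodings is in use, the accented character is typically the part of the term that gets mangled.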

**SPAM** solr

2010-08-31 Thread Andrea Gazzarini
eh eh eh...it's a little bit hard to answer...could you provide some detail? cheers, Andrea hello all, I have indexed database using DIH. But I am not able to search the data using each field.

Re: Question about SOLR custom sort order

2012-01-01 Thread Andrea Gazzarini
We fulfilled a similar requirement by creating a new field that is populated at client level (a standalone app that converts binary data into Solr input documents) Andrea On 1/1/12, Erick Erickson erickerick...@gmail.com wrote: There's no good way of enforcing this as far as I know as you've

Re: Severe errors in solr configuration

2011-07-26 Thread Andrea Gazzarini
I don't know glassfish; the error you're reporting is a low-level security exception (method access) and doesn't seem to be related with web application (JAAS) security. Did you change the web.xml of solr war for including security constraints, security collections, login-config, roles and so

Re: Severe errors in solr configuration

2011-07-26 Thread Andrea Gazzarini
is working. Doing that in Jboss or tomcat is very simple. Regards, Andrea -Original Message- From: Andrea Gazzarini andrea.gazzar...@atcult.it Date: Tue, 26 Jul 2011 18:24:48 To: solr-user@lucene.apache.org; Xue-Feng Yangjust4l...@yahoo.com Reply-To: andrea.gazzar...@atcult.it Subject: Re

Re: how to update specific document (record) of solr

2011-10-31 Thread Andrea Gazzarini
Probably a stupid question...why is it not possible to update stored and not indexed fields? Andrea On 10/31/11, Erick Erickson erickerick...@gmail.com wrote: No, you can't update individual fields. And you probably won't be able to unless Solr (well, Lucene actually) undergoes some *major*

ReplicationHandler with external indexes

2011-10-31 Thread Andrea Gazzarini
Hi all, I have a master/slave architecture synchronized using the built-in ReplicationHandler. As part of recent development we created an extension (a RequestHandler) that (on master), without going deeper in details, creates some foreign indexes in the data directory (in the same level of the

Re: Using Solr components for dictionary matching?

2011-11-03 Thread Andrea Gazzarini
Assuming that by dictionary you (also) mean a thesaurus, you could consider using SIREn, which is a SOLR / Lucene add-on able to index (and search) RDF data. In this way, you could index an already available thesaurus like LCSH or Agrovoc, or build and index your own vocabulary.

Dismax, pf and qf

2011-11-14 Thread Andrea Gazzarini
Hi all, In my dismax request handler I'm usually using both qf and pf parameters in order to do phrase and query search with different boosting. Now there are some scenarios where I want just the pf active (without qf). Other than surrounding my query with double quotes, is there another way to do

boosting injection

2010-10-19 Thread Andrea Gazzarini
Hi all, I have a client that is sending this query q=title:history AND author:joyce is it possible to transform at runtime this query in this way: q=title:history^10 AND author:joyce^5 ? Best regards, Andrea
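
To make the requested transformation concrete, here is a minimal Python sketch of the rewrite as a plain string operation. It is not a Solr feature and not the answer given in the thread; it only assumes that something sitting between the client and Solr (a proxy or a custom component, both hypothetical here) could rewrite the `q` value. The `FIELD_BOOSTS` table and the `inject_boosts` function are invented for illustration, and the pattern only handles simple `field:word` clauses, not phrases or ranges.

```python
import re

# Hypothetical per-field boosts, taken from the example in the message.
FIELD_BOOSTS = {"title": 10, "author": 5}

# Matches a simple field:word clause that is not already followed by a boost.
_CLAUSE = re.compile(r"\b(\w+):(\w+)\b(?!\^)")


def inject_boosts(query, boosts=FIELD_BOOSTS):
    """Append ^boost to every field:word clause whose field has a boost."""

    def add_boost(match):
        field, term = match.group(1), match.group(2)
        boost = boosts.get(field)
        if boost is None:
            # Unknown field: leave the clause exactly as it was.
            return match.group(0)
        return f"{field}:{term}^{boost}"

    return _CLAUSE.sub(add_boost, query)


print(inject_boosts("title:history AND author:joyce"))
# title:history^10 AND author:joyce^5
```

Clauses that already carry a boost, and fields missing from the table, are left unchanged, so the rewrite can be applied to a query more than once without stacking boosts.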

Re: **SPAM** Re: boosting injection

2010-10-19 Thread Andrea Gazzarini
, The Hitchhikers Guide to the Galaxy On Tue, Oct 19, 2010 at 8:48 AM, Andrea Gazzarini andrea.gazzar...@atcult.it wrote: Hi all, I have a client that is sending this query q=title:history AND author:joyce is it possible to transform at runtime this query in this way

Re: boosting injection

2010-10-19 Thread Andrea Gazzarini
, The Hitchhikers Guide to the Galaxy On Tue, Oct 19, 2010 at 8:48 AM, Andrea Gazzarini andrea.gazzar...@atcult.it wrote: Hi all, I have a client that is sending this query q=title:history AND author:joyce is it possible to transform at runtime this query in this way

Re: boosting injection

2010-10-19 Thread Andrea Gazzarini
04:23:46 pm Andrea Gazzarini wrote: Hi Ken, thanks for your response...unfortunately it doesn't solve my problem. I cannot change the client behaviour so the query must be a query and not only the query terms. In this scenario, it would be great, for example, if I could declare the boost

Re: boosting injection

2010-10-19 Thread Andrea Gazzarini
On Tue, Oct 19, 2010 at 10:23 AM, Andrea Gazzarini andrea.gazzar...@atcult.it wrote: Hi Ken, thanks for your response...unfortunately it doesn't solve my problem. I cannot change the client behaviour so the query must be a query and not only the query terms. In this scenario, it would be great

R: limit the search results to one category

2010-12-15 Thread Andrea Gazzarini
Did you try with filterquery? Andrea Gazzarini -Original Message- From: sara motahari saramotah...@yahoo.com Date: Tue, 14 Dec 2010 17:34:52 To: solr-user@lucene.apache.org Reply-To: solr-user@lucene.apache.org Subject: limit the search results to one category Hi all, I am using

Re: Java replication takes slaves down

2011-07-21 Thread Andrea Gazzarini
We are using a similar architecture but with two slaves, the index is around 9GB and we don't have such a problem... Each slave is running on a separate machine so we have three nodes in total (1 indexer + 2 searchers)...initially everything was on a single node and it was working without

Tokenization at query time

2013-08-12 Thread Andrea Gazzarini
Hi all, I have a field (among others) in my schema defined like this: <fieldtype name="mytype" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.KeywordTokenizerFactory" /> <filter class="solr.LowerCaseFilterFactory" /> <filter

Re: Tokenization at query time

2013-08-12 Thread Andrea Gazzarini
?q=myfield:Mag.\%20778\%20G\%2069&debugQuery=on OR http://localhost:8983/solr/collection1/select?q=Mag.\%20778\%20G\%2069&debugQuery=on&qf=text%20myfield&defType=edismax I hope this helps Tanguy On Aug 12, 2013, at 11:13 AM, Andrea Gazzarini andrea.gazzar...@gmail.com wrote: Hi all, I have a field

Re: Tokenization at query time

2013-08-12 Thread Andrea Gazzarini
-Original Message- From: Andrea Gazzarini Sent: Monday, August 12, 2013 6:52 AM To: solr-user@lucene.apache.org Subject: Re: Tokenization at query time Hi Tanguy, thanks for fast response. What you are saying corresponds perfectly with the behaviour I'm observing. Now, other than having a big problem

Re: Tokenization at query time

2013-08-13 Thread Andrea Gazzarini
, Andrea Gazzarini andrea.gazzar...@gmail.com wrote: Clear, thanks for response. So, if I have two fields <fieldtype name="type1" class="solr.TextField"> <analyzer> <tokenizer class="solr.KeywordTokenizerFactory" /> <filter class="solr.LowerCaseFilterFactory" /> <filter class

Re: Tokenization at query time

2013-08-13 Thread Andrea Gazzarini
Trying...thank you very much! I'll let you know Best, Andrea On 08/13/2013 04:18 PM, Erick Erickson wrote: I think you can get what you want by escaping the space with a backslash YMMV of course. Erick On Tue, Aug 13, 2013 at 9:11 AM, Andrea Gazzarini andrea.gazzar...@gmail.com wrote

SOLR memory usage (sort fields? replication?)

2013-08-13 Thread Andrea Gazzarini
Hi, I'm getting some Out of memory (heap space) from my solr instance and after investigating a little bit, I found several threads about sorting behaviour in SOLR. First, some information about the environment - I'm using SOLR 3.6.1 and master / slave architecture with 1 master and 2

Re: SOLR memory usage (sort fields? replication?)

2013-08-13 Thread Andrea Gazzarini
of title sort values for this mn On 08/13/2013 05:51 PM, Andrea Gazzarini wrote: Hi, I'm getting some Out of memory (heap space) from my solr instance and after investigating a little bit, I found several threads about sorting behaviour in SOLR. First, some information about the environment - I'm

Who's cleaning the Fieldcache?

2013-08-14 Thread Andrea Gazzarini
After doing some replications (replicationOnOptimize) I see - on master filesystem files that belong to two segments (I suppose the oldest is just a commit point) - on master admin console (SolrIndexReader{this=4f2452c6,r=ReadOnlyDirectoryReader@4f2452c6,refCnt=1,segments=1}) but on

Re: Who's cleaning the Fieldcache?

2013-08-15 Thread Andrea Gazzarini
Hi Chris, Robert Thank you very much. First, answers to your questions: 1) which version of Solr are you using? 3.6.0 2) is it possibly you have multiple searchers open (ie: one in use while another one is warming up) when you're seeing these stats? No, no multiple searchers. Now, after

Re: Adding one core to an existing core?

2013-08-22 Thread Andrea Gazzarini
First, a core is a separate index so it is completely independent from the already existing core(s). So basically you don't need to reindex. In order to have two cores (but the same applies for n cores): you must have in your solr.home the file (solr.xml) described here

Re: UpdateProcessor not working with DIH, but works with SolrJ

2013-08-22 Thread Andrea Gazzarini
You should declare this <str name="update.chain">nohtml</str> in the defaults section of the RequestHandler that corresponds to your dataimporthandler. You should have something like this: <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler"> <lst

Re: UpdateProcessor not working with DIH, but works with SolrJ

2013-08-22 Thread Andrea Gazzarini
that has been deprecated in favor of update.chain so this shouldn't be the problem. Best, Gazza On 08/22/2013 05:57 PM, Shawn Heisey wrote: On 8/22/2013 9:42 AM, Andrea Gazzarini wrote: You should declare this <str name="update.chain">nohtml</str> in the defaults section of the RequestHandler

Re: UpdateProcessor not working with DIH, but works with SolrJ

2013-08-22 Thread Andrea Gazzarini
Ok, found: <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler"> <lst name="defaults"> <str name="config">dih-config.xml</str> <str name="update.chain">nohtml</str> </lst> </requestHandler> Of course, my mistake...when I

Re: Solrconfig.xml

2013-08-23 Thread Andrea Gazzarini
Yes, if your RequestHandler implements SolrCoreAware you will get a SolrCore reference in inform(...) method. In SolrCore you have all what you need (specifically SolrResourceLoader is what you need) Note that if your request handler is a SearchHandler you don't need to implement that

Re: Index a database table?

2013-08-23 Thread Andrea Gazzarini
Seems ok assuming that - you have mysql driver jar in your $SOLR_HOME/lib - New is database name - user root / password is valid - table exists - SOLR has a schema with the following id and first_name fields declared About How do I know if they are wrong? Why don't you try? On 08/23/2013

Re: Index a database table?

2013-08-24 Thread Andrea Gazzarini
Actually I wanted every single step to be clear, thats why I asked. Now there is written: Ensure that your solr schema (schema.xml) has the fields 'id', 'name', 'desc'. Change the appropriate details in the data-config.xml My schema.xml is not having these fields. That means I have to

Re: Tokenization at query time

2013-08-26 Thread Andrea Gazzarini
wrote: I think you can get what you want by escaping the space with a backslash YMMV of course. Erick On Tue, Aug 13, 2013 at 9:11 AM, Andrea Gazzarini andrea.gazzar...@gmail.com wrote: Hi Erick, sorry if that wasn't clear: this is what I'm actually observing in my application. I wrote

Re: Tokenization at query time

2013-08-26 Thread Andrea Gazzarini
, Andrea Gazzarini andrea.gazzar...@gmail.com wrote: Hi Erick, escaping spaces doesn't work... Briefly, - In a document I have an ISBN field that (stored value) is *978-90-04-23560-1* - In the index I have this value: *9789004235601* Now, I want be able to search the document by using: 1) q=*978

Re: Tokenization at query time

2013-08-26 Thread Andrea Gazzarini
about what I did...I'm running my regression tests...all seems green...let's see But you know your problem space best Best, Erick Thank you very much Best, Gazza On Mon, Aug 26, 2013 at 9:04 AM, Andrea Gazzarini andrea.gazzar...@gmail.com wrote: Hi Erick, sorry I forgot the SOLR version

Re: Master / Slave Set Up Documentation

2013-08-26 Thread Andrea Gazzarini
You mean this http://wiki.apache.org/solr/SolrReplication ? What's wrong with this page? It seems clear. I'm widely using replication and the first time I set up a 1 master + 2 slaves by simply following that page On 26 Aug 2013 18:54, Jared Griffith jgriff...@picsauditing.com wrote: Hello,

Re: Solr 4.0 Functions in FL: performance?

2013-08-30 Thread Andrea Gazzarini
Hi, not actually sure I got the point but Are values calculated over the whole set of docs? Only over the resulting set of doc? Or, better, over the docs actually serialized in results. The third: a function is like a virtual field computed in real-time associated with each (returned) doc.

Re: solr performance against oracle

2013-09-04 Thread Andrea Gazzarini
You said nothing about your environment (e.g. operating systems, what kind of Oracle installation you have, what kind of SOLR installation, how much data in the database, how many documents in the index, RAM for SOLR, for Oracle, for the OS, and in general hardware...and so on)... Anyway...a migration

Re: json update moves doc to end

2013-12-03 Thread Andrea Gazzarini
AFAIK if you don't supply or configure a sort parameter, SOLR sorts by score desc. In that case, you may want to understand (or at least view) how each document score is calculated: you can run the query with debugQuery set and see the whole explain. This great tool helped me a lot:

Re: SOLR 4 not utilizing multi CPU cores

2013-12-04 Thread Andrea Gazzarini
Hi, I did more or less the same but didn't get that behaviour...could you give us more details? Best, Gazza On 5 Dec 2013 06:54, Salman Akram salman.ak...@northbaysolutions.net wrote: Hi, We recently upgraded to SOLR 4.6 from SOLR 1.4.1. Overall the performance went down for large phrase queries.

Re: Maven archetype

2013-12-06 Thread Andrea Gazzarini
Hi, if you want to deploy the SOLR war on tomcat you should do once so why do you need a maven archetype? You can just get the war from the website and deploy to your server. If you need to use maven because you are, for example, developing in eclipse and you want to just launch jetty:run

Re: war file deployment proble

2013-12-17 Thread Andrea Gazzarini
Hi, It is hard (at least for me) to understand the relation between Solr and your issue. Could you please explain? Best, Gazza On 17 Dec 2013 19:30, kumar pavan2...@gmail.com wrote: I have created a web application in windows using eclipse it is properly executing in windows environment. But when i

Re: Solr hanging when extracting a some broken .doc files

2013-12-17 Thread Andrea Gazzarini
Hi Augusto, I don't believe the mailing list allows attachments. Could you please post the complete stacktrace? In addition, set the logging level of tika classes to FINEST in solr console, maybe can be helpful Best, Andrea On 17 Dec 2013 16:30, Augusto Camarotti augu...@prpb.mpf.gov.br wrote:

Re: program termination in solrj

2013-12-21 Thread Andrea Gazzarini
Where's the error? On 21 Dec 2013 14:31, Nutan nutanshinde1...@gmail.com wrote: @Manish: i did add() but still the same error. *Logs* shows this: INFO: [document] webapp=/solr path=/update params={wt=javabin&version=2} {add=[23 (1455037990928646144)]} 0 5 Dec 21, 2013 6:56:01 PM

Re: program termination in solrj

2013-12-21 Thread Andrea Gazzarini
Hi Nutan, It is not really clear (at least to me) what your problem is. After your client program ends, are you seeing the doc in SOLR? Because I assume the piece of code you pasted is (directly or indirectly) called from a main method...and therefore that program *normally* terminates once it has done its

Re: program termination in solrj

2013-12-21 Thread Andrea Gazzarini
But those are not errors...the first is red because it is written to Eclipse's stderr...but you can see the INFO level. The second is the Eclipse debugger telling you about sources not found...but it is not an error...just click on the button, locate the sources (if you have them) and you will be able to debug those

Re: program termination in solrj

2013-12-21 Thread Andrea Gazzarini
Source not found is not a problem and most important thing has nothing to do with solr Where did you see there are no doc in solr? Did you run a query? If so, - what query? - What was the response? - What is the config of the corresponding search handler? - Could you please verify that info in

Re: program termination in solrj

2013-12-21 Thread Andrea Gazzarini
How did you get those 14 docs indexed? Could you please post the default search handler config? On 21 Dec 2013 15:53, Nutan nutanshinde1...@gmail.com wrote: i check as with query: q=id:23, response is : numfound=0, Statictics is: Last Modified:16 minutes ago Num Docs:14 Max Doc:16

Re: Prevent indexing of several phrases

2013-12-21 Thread Andrea Gazzarini
I would do that on client side or in an UpdateRequestProcessor Andrea On 21 Dec 2013 09:06, Jorge Luis Betancourt González jlbetanco...@uci.cu wrote: Right now we have a custom use case: Basically we are using a separated solr core to store/suggest queries made by our users in our frontend app

Re: indexing .docx using solrj

2013-12-21 Thread Andrea Gazzarini
That class seems to be in xercesImpl jar...probably is a dependency of tika or a required lib of the underlying parser used for that kind of document Andrea On 21 Dec 2013 20:07, sweety sweetyshind...@yahoo.com wrote: i am trying to index .docx file using solrj, i referred this link:

Re: SolrJ 503 Error

2013-12-21 Thread Andrea Gazzarini
Not sure if we have the same scenario but I got the same error code when I was trying to do a lot of requests (updates and queries) with 10 secs of (hard) autocommit to a SOLR instance running in a servlet engine (tomcat) with few resources (if I remember, no more than 1GB of RAM) Andrea Hi All, I

Re: program termination in solrj

2013-12-21 Thread Andrea Gazzarini
No you don't need to do that. Nutan, Andrea told me that is going to raise the white flag :D Another question: Is possible that your default search handler uses dismax / edismax and therefore the query id:23 is not valid and returns 0 docs? Another question: could you try to - get and post the

Re: indexing .docx using solrj

2013-12-21 Thread Andrea Gazzarini
That is not a jar for your (eclipse) compiler but for tomcat. You should have that jar available in tomcat or (better) in the lib folder of your solr.home. Eclipse doesn't need to recognise that On 21 Dec 2013 21:15, sweety sweetyshind...@yahoo.com wrote: I have added that jar,in the build path. but

Re: indexing .docx using solrj

2013-12-21 Thread Andrea Gazzarini
Ok, then please tell us a bit more about your context: - versions (solr / java / tomcat) - where are tika libs? In solr.home lib or in tomcat lib? On 21 Dec 2013 21:15, sweety sweetyshind...@yahoo.com wrote: I have added that jar,in the build path. but the same error,i get. Why is eclipse not

Re: indexing .docx using solrj

2013-12-22 Thread Andrea Gazzarini
The error you were getting is a LinkageError so, simplifying, a class that was available at compile time is not there at runtime (again, a very simplistic definition because in this way this could be similar to a ClassNotFoundException...and it isn't). Probably the class (and the jar) is there

Re: adding wild card at the end of the text and search(like sql like search)

2013-12-27 Thread Andrea Gazzarini
Hi Suren, You could try a textfield with a WordDelimiter filter + EdgeNGram filter (this latter only in the index analyzer). In this way your heading will be indexed as Jo Joh John Johns Johnsonso Johnsonson Johnsonsons and a query for Johnson so Will be translated into Johnsonso
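
As a rough illustration of the edge n-gram idea mentioned above, the following Python sketch lists the prefixes that would be emitted for a single token. It is plain Python, not Solr's filter; the function name and the `min_gram` / `max_gram` parameters are assumptions chosen to mirror the general concept of a minimum and maximum gram size.

```python
def edge_ngrams(token, min_gram=2, max_gram=15):
    """Return the prefixes of token whose length is between min_gram and max_gram."""
    longest = min(max_gram, len(token))
    return [token[:size] for size in range(min_gram, longest + 1)]


print(edge_ngrams("Johnson"))
# ['Jo', 'Joh', 'John', 'Johns', 'Johnso', 'Johnson']
```

Because every prefix is stored at index time, a query term such as `Johns` can match as an ordinary exact term, which is why the suggestion applies the n-gram step only in the index analyzer and not at query time.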

Re: Empty facets on Solr with MySQL

2014-01-02 Thread Andrea Gazzarini
Then that means dih is not populating the field.. I guess if you set required=true in your field you will get some error during indexing Try to debug the index process and /or run queries outside solr in order to see results, field names matches and so on Best, Andrea I get Sorry, no Term Info

Re: Empty facets on Solr with MySQL

2014-01-02 Thread Andrea Gazzarini
Hi Peter, Hi Andrea, I changed it to: <field name="cat_name" required="true" type="text" indexed="true" stored="true" multiValued="true" /> When I run full-import 0 documents are indexed, but no errors in the console. That's the reason why you can't see facets and errors: 0 documents are indexed

Re: Empty facets on Solr with MySQL

2014-01-03 Thread Andrea Gazzarini
entire DB currently holds only 4 records, there's no need for a LIMIT clause I guess? Andrea Gazzarini-4 wrote In the solr console set to DEBUG / FINEST the level of DIH classes How do I do that? Thanks! -- View this message in context: http://lucene.472066.n3.nabble.com/Empty-facets

Re: Empty facets on Solr with MySQL

2014-01-03 Thread Andrea Gazzarini
no need for a LIMIT clause I guess? Andrea Gazzarini-4 wrote In the solr console set to DEBUG / FINEST the level of DIH classes How do I do that? Thanks! -- View this message in context: http://lucene.472066.n3.nabble.com/Empty-facets-on-Solr-with-MySQL-tp4109170p4109290.html Sent

Re: Empty facets on Solr with MySQL

2014-01-03 Thread Andrea Gazzarini
Hi Peter, Unfortunately I deleted your first email where you wrote a piece of your schema...the problem seems to be cat_name and not cat_name_raw...could you please post your schema again? On 3 Jan 2014 13:40, PeterKerk vettepa...@hotmail.com wrote: Hi Andrea, You were right, I do see errors

Re: Empty facets on Solr with MySQL

2014-01-03 Thread Andrea Gazzarini
Hi Peter, I can only guess that the result set doesn't contain a cat_name (case insensitive) column. Other option / question: do you have a transformer (like scriptTransformer) that manipulates the resultset? You can debug the resultset in a main class by doing rs.getString("cat_name") Cheers,

Re: Empty facets on Solr with MySQL

2014-01-03 Thread Andrea Gazzarini
='${article.id}'; /entity I have no transformers on my resultset (I checked my querystring, schema.xml and data-config.xml, since I'm not even sure where it would have to be defined). Andrea Gazzarini-4 wrote You can debug the resultset in a main class by doing rs.getString

Re: Empty facets on Solr with MySQL

2014-01-03 Thread Andrea Gazzarini
I don't remember your dih-config.xml (could you post it again?) - remove the trailing ; from the query. It is a valid delimiter only when you run queries in MySQL Workbench; - I assume there's a parent entity named (name=) article. - are you sure the column of the article entity is id (NB this

Re: Empty facets on Solr with MySQL

2014-01-03 Thread Andrea Gazzarini
Nice to hear you (not me) solved the problem. You're welcome Andrea On 3 Jan 2014 21:19, PeterKerk vettepa...@hotmail.com wrote: No need, you solved it! It was the id name, it had to be uppercase. btw the ; is still there in the query, but everything still works. Thanks! -- View this

Re: using extract handler: data not extracted

2014-01-11 Thread Andrea Gazzarini
Why is it so?? I'm reading your post on my mobile so probably I didn't get the point: other than the date_modified field, what is the problem? Fields with the ignored prefix? That is perfectly right according to your configuration. The other fields you declared aren't there because they are not

Re: using extract handler: data not extracted

2014-01-11 Thread Andrea Gazzarini
Try to set to FINEST / DEBUG level the extract request handler and Tika packages and post relevant log lines On 11 Jan 2014 14:38, sweety sweetyshind...@yahoo.com wrote: Sorry, that my question was not clear. Initially when indexed pdf files it showed the data within this pdf in the contents

Re: using extract handler: data not extracted

2014-01-11 Thread Andrea Gazzarini
Set the Tika packages to FINEST too On 11 Jan 2014 15:25, sweety sweetyshind...@yahoo.com wrote: I set the level of extract handler to finest, now the logs are : INFO: [document] webapp=/solr path=/update/extract params={commit=true&literal.id=12&debug=true} {add=[12 (1456944038966984704)],commit=}

Re: using extract handler: data not extracted

2014-01-11 Thread Andrea Gazzarini
On the admin console you should be able to tune the log at package level On 11 Jan 2014 17:31, sweety sweetyshind...@yahoo.com wrote: how set finest for tika package?? -- View this message in context:

Re: using extract handler: data not extracted

2014-01-12 Thread Andrea Gazzarini
Wait, don't confuse things...they should be three different issues: 1. with curl indexing happens but leaves the content field empty, so probably something occurs at the Tika level during the text extraction. That's the reason why I told you about the Tika logging 2. with solrj indexing doesn't happen

Re: using extract handler: data not extracted

2014-01-12 Thread Andrea Gazzarini
A premise: as Erik explained, most probably this issue has nothing to do with SOLR. So, these are the options that, in my mind, you have: *OPTION #1: Using Tika as a command line tool* a) Download Tika. Make sure it is the same version as in your SOLR b) Read here:

Re: using extract handler: data not extracted

2014-01-12 Thread Andrea Gazzarini
Please stay on (or clarify) your issue: in the first example you told us the problem is with Coding.pdf file. What is that Cloud.docx? Why don't you try with Coding.pdf? And what is the result of the extraction from command line with Coding.pdf and the same tika version that is in your SOLR? I

Re: using extract handler: data not extracted

2014-01-12 Thread Andrea Gazzarini
Not really sure...the issue seems related to text extraction so the first suspect is Tika...SOLR is playing a secondary role here. If Tika is extracting correctly, there should be an error or a warning on the Solr side (an exception, a content field too long warning or something like that) What about

Replication and conf files

2014-01-22 Thread Andrea Gazzarini
Hi all, Reading here http://wiki.apache.org/solr/SolrReplication#How_are_configuration_files_replicated.3F I don't understand what is the observed behaviour in case - confFiles contains schema.xml - schema doesn't change between replication cycles I mean, I read that the file is physically

Re: Changing contextRoot from /solr to /

2015-02-06 Thread Andrea Gazzarini
Sorry I didn't read your email carefully: the rename workaround doesn't work if you want to publish a webapp on / On 02/06/2015 02:51 PM, Andrea Gazzarini wrote: That config parameter is within the solrCloud section while you are talking about a standalone server. The context root of a webapp

Re: Changing contextRoot from /solr to /

2015-02-06 Thread Andrea Gazzarini
That parameter is within the solrCloud section, while you are talking about a standalone server. The context root of a webapp is not something you can configure within the webapp itself, each servlet engine / application server has its own way to do that. The simplest (but definitely no

Re: Easiest way to embed solr in a desktop application

2015-01-15 Thread Andrea Gazzarini
Hi Robert, I've used the EmbeddedSolrServer in a scenario like that and I never had problems. I assume you're talking about a standalone application, where the whole index resides locally and you don't need any cluster / cloud / distributed feature. I think the usage of EmbeddedSolrServer is

Re: Want to modify Solr Source Code

2015-03-17 Thread Andrea Gazzarini
Hi, if you followed what is written in the link that Gora suggested, you should have a workspace without errors. Eclipse compiler allows for incremental builds, that is, all code is incrementally compiled as soon as you finish typing. So if you inserted those lines and you don't see any error

Re: Connection pool shutdown error

2015-03-19 Thread Andrea Gazzarini
I bet the problem is how the SolrServer instance is used within Spring Repository. I think somewhere you should alternatively - explicitly close the client each time. - reuse the same instance (and finally close that) But being a Spring newbie I cannot give you further information. Best,

Re: Unable to perform search query after changing uniqueKey

2015-03-27 Thread Andrea Gazzarini
Hi Edwin, please provide some other detail about your context, (e.g. complete stacktrace, query you're issuing) Best, Andrea On 03/27/2015 09:38 AM, Zheng Lin Edwin Yeo wrote: Hi everyone, I've changed my uniqueKey to another name, instead of using id, on the schema.xml. However, after I

Re: Installing the auto-phrase-tokenfilter

2015-03-27 Thread Andrea Gazzarini
Hi, I never used that but I think you should - get the source code / clone the repository - run the ant build (I see a dist target) - put the artifact in your core / shared lib dir so Solr can see that library - have a look at the README [1] for how to use that Best, Andrea [1]

Re: Order of Copy Field and Analyzer

2015-04-23 Thread Andrea Gazzarini
Yes, the copied value is always the original one (the stored), regardless any analysis, which is field-scoped On 23 Apr 2015 19:13, Kaushik kaushika...@gmail.com wrote: Hello, What is the order in which these occur? - Copy field - Analyzer The other way of asking the above

Re: CDATA response is coming with lt: instead of

2015-04-21 Thread Andrea Gazzarini
It seems this is done in XML(Response)Writer: XML.escapeAttributeValue(stylesheet, writer); I suppose this is valid according with XML escaping rules, but it's just a thought of mine because I don't know so strictly those rules. I see the character is being escaped so what you get is coheren

Re: Solr + RDF = SolRDF

2015-04-28 Thread Andrea Gazzarini
Hi Charlie, definitely cool and interesting. Best, Andrea On 04/28/2015 10:20 AM, Charlie Hull wrote: On 27/04/2015 21:41, Andrea Gazzarini wrote: Hi guys, I'd like to share with you a project (actually a hobby for me) where I'm spending my free time, maybe someone could get some idea

Re: Solr + RDF = SolRDF

2015-04-28 Thread Andrea Gazzarini
, Andrea On 04/28/2015 04:26 PM, Davis, Daniel (NIH/NLM) [C] wrote: Both cool and interesting. Andrea, does your Solr RDF indexing project support inference? If so, is inference done by Jena or ahead of time before indexing by Solr? -Original Message- From: Andrea Gazzarini

Solr + RDF = SolRDF

2015-04-27 Thread Andrea Gazzarini
Hi guys, I'd like to share with you a project (actually a hobby for me) where I'm spending my free time, maybe someone could get some idea or benefit from it. https://github.com/agazzarini/SolRDF I called it SolRDF (Solr + RDF): It is a set of Solr extensions for managing (indexing and querying)

Re: AW: blocked in org.apache.solr.core.SolrCore.getSearcher(...) ?

2015-05-03 Thread Andrea Gazzarini
I'd look at the thread view in the admin console. That would give an idea about what the system is doing. You can get the same information from the command line using # jstack <pid> > output.log Best, Andrea On 3 May 2015 18:53, Clemens Wyss DEV clemens...@mysign.ch wrote: Just opened the very

Re: Indexing PDF and MS Office files

2015-04-14 Thread Andrea Gazzarini
It seems something like https://issues.apache.org/jira/browse/TIKA-1251. I see you're using Solr 4.10.2 which uses Tika 1.5 and that issue seems to be fixed in Tika 1.6. I agree with Erik: you should try with another version of Tika. Best, Andrea On 04/14/2015 06:44 PM, Vijaya Narayana Reddy

Re: ContentTypes supported by Solr to index

2015-04-15 Thread Andrea Gazzarini
Hi Vijay, here you can find all supported formats by Tika, which is internally used by SolrCell: * https://tika.apache.org/1.4/formats.html * https://tika.apache.org/1.5/formats.html * https://tika.apache.org/1.6/formats.html * https://tika.apache.org/1.7/formats.html Best, Andrea

Re: ContentTypes supported by Solr to index

2015-04-15 Thread Andrea Gazzarini
? No error is thrown in the overall process and the java program completes successfully. But when I query the Solr UI, only 8 files are indexed. Attached is a simple screenshot of the files types I am trying to index. Thanks Regards Vijay On 15 April 2015 at 15:27, Andrea Gazzarini a.gazzar

Re: Indexing PDF and MS Office files

2015-04-14 Thread Andrea Gazzarini
Hi Vijay, Please paste an extract of your schema, where the content field (the field where the PDF text should be) and its type are declared. For the other issue, please paste the whole stacktrace because org.apache.tika.parser.microsoft.OfficeParser* says nothing. The complete stacktrace (or

Re: Indexing PDF and MS Office files

2015-04-14 Thread Andrea Gazzarini
Hi, solrconfig.xml (especially if you didn't touch it) should be good. What about the schema? Are you using the one that comes with the download bundle, too? I don't see the stacktrace..did you forget to paste it? Best, Andrea On 04/14/2015 06:06 PM, Vijaya Narayana Reddy Bhoomi Reddy

Re: HttpSolrServer and CloudSolrServer

2015-04-17 Thread Andrea Gazzarini
If you're using SolrCloud then you should use CloudSolrServer as it is able to abstract / hide the interaction with the cluster. HttpSolrServer communicates directly with a Solr instance. Best, Andrea On 04/17/2015 10:59 AM, Vijay Bhoomireddy wrote: Hi All, Good Morning!! For

Re: sort by a copy field error

2015-04-14 Thread Andrea Gazzarini
Hi Pedro Please post the request that produces that error Andrea On 14 Apr 2015 19:33, Pedro Figueiredo pjlfigueir...@criticalsoftware.com wrote: Hello, I have a pretty basic question: how can I sort by a copyfield? My schema conf is: field name=name type=text_general_edge_ngram

Re: sort by a copy field error

2015-04-15 Thread Andrea Gazzarini
, Portugal T. +351 229 446 927 | F. +351 229 446 929 www.criticalsoftware.com PORTUGAL | UK | GERMANY | USA | BRAZIL | MOZAMBIQUE | ANGOLA A CMMI® LEVEL 5 RATED COMPANY CMMI® is registered in the USPTO by CMU -Original Message- From: Andrea Gazzarini [mailto:a.gazzar...@gmail.com] Sent: 14

Re: solrconfig.xml error

2015-04-08 Thread Andrea Gazzarini
Hi Pradeep, AFAIK the mailing list doesn't allow attachments. I think pasting the error should be enough Best, Andrea On 04/08/2015 09:02 AM, Pradeep wrote: We have installed solr-4.3.0 is our local but we are getting error. Please find attachment. And help us to fix this error. Thank You.

Re: Indexing problem

2015-06-06 Thread Andrea Gazzarini
Hi Midas, It seems there's a thread that is getting data from a database using DIH. No thread seems blocked and the throughput *could* be normal, depending on many factors. Could you please expand a bit about your context? Best, Andrea On 6 Jun 2015 12:21, Midas A test.mi...@gmail.com wrote:
