Re: change get to post ??

2010-04-13 Thread Michael Kuhlmann
Hi, the problem is not the GET request type, the problem is that you build a far too complicated query. This won't scale well and looks rather weird. Why don't you just add all parent category ids to every document at index time? Then you could simply filter your request with the topmost

Re: change get to post ??

2010-04-13 Thread Michael Kuhlmann
You need to change the way your data is imported. Or look for an alternative way to build your query. It depends on your data model, and your import mechanism. Do you really have hundreds of categories? BTW, childs is amusing! ;-) -Michael Am 13.04.2010 14:12, schrieb stockii: hi. thx

Re: change get to post ??

2010-04-13 Thread Michael Kuhlmann
Hi, Am 13.04.2010 14:52, schrieb stockii: some cat, have 300 child-categories. And that's the reason why you shouldn't add them all to your filter query. or, how can i import the cat-data ? Again: How do you do it NOW? -Michael

Re: change get to post ??

2010-04-13 Thread Michael Kuhlmann
I wouldn't do autosuggestion with normal queries anyway. Because of better performance... :-) I don't use DIH, so I can't tell what to do then. For us, we import data with a simple PHP script, which was rather easy to write. So we have full control on Solr's data structure. You somehow have to

Re: Turn off request logging for some handlers?

2010-04-15 Thread Michael Kuhlmann
Am 15.04.2010 17:45, schrieb Shawn Heisey: Is it possible to turn off request logging for some handlers? Specifically, I'd like to stop logging requests to /admin/ping and /replication, which get hit very often. Hi, you can set logging for nearly every single task here:

Re: is solr ignored my filters ?

2010-04-19 Thread Michael Kuhlmann
Am 19.04.2010 16:09, schrieb stockii: so i want to see how it is indexed. Go to the admin panel, open the schema browser, and set the number of shown tokens to 1 or something. -Michael

Re: is solr ignored my filters ?

2010-04-19 Thread Michael Kuhlmann
Am 19.04.2010 16:29, schrieb stockii: oha, yes thx but we have 800 000 items ... to find the right in this way ? XD Then use the TermsComponent: http://wiki.apache.org/solr/TermsComponent -Michael

Re: Best Open Source

2010-04-20 Thread Michael Kuhlmann
Nice site. Really! In addition to Dave: How do I search with tags enabled? If I search for Blog, I can see that there's one blog software written in Java. When I click on the Java tag, then my search is discarded, and I get all Java software. when I do my search again, the tag filter is lost. It

Re: SpellChecking

2010-05-03 Thread Michael Kuhlmann
Am 03.05.2010 16:43, schrieb Jan Kammer: Hi, It worked fine with a normal field. There must be something wrong with copyfield, or why does dataimporthandler add/update no more documents? Did you define your destination field as multiValued? -Michael

Re: Score cutoff

2010-05-04 Thread Michael Kuhlmann
Am 03.05.2010 23:32, schrieb Satish Kumar: Hi, Can someone give clues on how to implement this feature? This is a very important requirement for us, so any help is greatly appreciated. Hi, I just implemented exactly this feature. You need to patch Solr to make this work. We at Zalando

Re: how to achieve filters

2010-05-18 Thread Michael Kuhlmann
Am 18.05.2010 16:54, schrieb Ahmet Arslan: 2. Query=rock where bitrate<128 where it should return only first and third docs where bitrate<128 q=rock&fq=bitrate:[* TO 128] for this bitrate field must be tint type. q=rock&fq=bitrate:[* TO 127] would be better, because bitrate should be lower
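The point about the upper bound can be sketched in a few lines of Python. This is only an illustration, assuming an integer-valued `bitrate` field and relying on the fact that square-bracket ranges in the Solr query syntax include both endpoints; the helper name `below_filter` is made up for this example.

```python
from urllib.parse import urlencode, parse_qs


def below_filter(field, limit):
    """Build a filter-query value for integer values strictly below `limit`.

    Square-bracket ranges include both endpoints, so the upper bound
    has to be limit - 1 to exclude `limit` itself.
    """
    return f"{field}:[* TO {limit - 1}]"


# Hypothetical request parameters: main query plus the range filter.
params = {"q": "rock", "fq": below_filter("bitrate", 128)}
query_string = urlencode(params)
print(query_string)
```

Encoding the parameters with `urlencode` also takes care of the spaces and brackets in the range expression.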

Re: Solr 1.4 query fails against all fields, but succeed if field is specified.

2010-05-31 Thread Michael Kuhlmann
Am 31.05.2010 11:50, schrieb olivier sallou: Hi, I have created an index with several fields. If I query my index in the admin section of solr (or via http request), I get results for my search if I specify the requested field: Query: note:Aspergillus (look for Aspergillus in field note)

Re: Solr 1.4 query fails against all fields, but succeed if field is specified.

2010-05-31 Thread Michael Kuhlmann
Am 31.05.2010 12:36, schrieb olivier sallou: Is there any way to query all fields including dynamic ones? Yes, using the *:term query. (Please note that the asterisk should not be quoted.) To answer your question, we need more details on your Solr configuration, esp. the part of schema.xml that

Re: Many Tomcat Processes on Server ?!?!?

2010-06-02 Thread Michael Kuhlmann
Am 02.06.2010 16:13, schrieb Paul Libbrecht: Is your server Linux? In this case this is very normal.. any java application spawns many new processes on linux... it's not exactly bound to threads unfortunately. Uh, no. New threads in Java typically don't spawn new processes on OS level. I

Re: PHP output at a multiValued AND dynamicField

2010-06-02 Thread Michael Kuhlmann
Am 02.06.2010 16:15, schrieb Jörg Agatz: yes i done.. but i dont know how i get the information out of the big Array... They're simply the keys of a single response array.

Re: Many Tomcat Processes on Server ?!?!?

2010-06-02 Thread Michael Kuhlmann
Am 02.06.2010 16:39, schrieb Paul Libbrecht: This is impressive, I had this in any Linux I've been using: SuSE, Ubuntu, Debian, Mandrake, ... Maybe there's some modern JDK with a modern Linux where it doesn't happen? It surely is not one process per thread though. I'm not a linux thread

Re: PHP output at a multiValued AND dynamicField

2010-06-02 Thread Michael Kuhlmann
Am 02.06.2010 16:42, schrieb Jörg Agatz: i don't understand what you mean! Then you should ask more precisely.

Re: Auto-suggest internal terms

2010-06-03 Thread Michael Kuhlmann
The only solution without doing any custom work would be to perform a normal query for each suggestion. But you might get into performance troubles with that, because suggestions are typically performed much more often than complete searches. The much faster solution that needs own work would be

Re: Auto-suggest internal terms

2010-06-03 Thread Michael Kuhlmann
Am 03.06.2010 13:02, schrieb Andrzej Bialecki: ..., and deploy this index in a separate JVM (to benefit from other CPUs than the one that runs your Solr core) Every known webserver is multithreaded by default, so putting different Solr instances into different JVMs will be of no use.

Re: Auto-suggest internal terms

2010-06-03 Thread Michael Kuhlmann
Am 03.06.2010 16:45, schrieb Andrzej Bialecki: You are right to a certain degree. Still, there are some contention points in Lucene/Solr, how threads are allocated on available CPU-s, and how the heap is used, which can make a two-JVM setup perform much better than a single-JVM setup given the

Analyzer for indexing only, not for queries

2010-03-12 Thread Michael Kuhlmann
Hi all, I have a field with some kind of category tree as a string. The format is like this: prefixfirstsecond#prefixotherfirstothersecond So, the document is categorized in two categories, separated by '#', and all categories start with the same prefix which I don't want to use. For

KeywordTokenizer for faceting; was: Re: Analyzer for indexing only, not for queries

2010-03-12 Thread Michael Kuhlmann
/AnalyzersTokenizersTokenFilters http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters HTH Erick On Fri, Mar 12, 2010 at 3:00 AM, Michael Kuhlmann michael.kuhlm...@zalando.de wrote:

Re: KeywordTokenizer for faceting; was: Re: Analyzer for indexing only, not for queries

2010-03-12 Thread Michael Kuhlmann
Hi Erick, On 03/12/10 17:09, Erick Erickson wrote: What's confusing me is that another of my fields does not have any analyzers defined at all, and it's working fine without problems. Field or fieldType? ...one of my fields with a fieldtype that does not have any analyzer defined at all,

KeywordTokenizer for faceting gives too many results

2010-03-12 Thread Michael Kuhlmann
Hi, I have some fields that are only used for faceting, so they're only queried by facet results. No modification is needed, no lowercase, nothing. So the KeywordTokenizerFactory seems to be perfect for them. Alas, when the value contains spaces, I'm still getting too many results. I have a

Re: KeywordTokenizer for faceting gives too many results

2010-03-12 Thread Michael Kuhlmann
On 03/12/10 17:51, Ahmet Arslan wrote: try using parentheses with queries that contain more than one term. fq=label:(Aces+of+London) Otherwise <defaultSearchField></defaultSearchField> jumps in. defaultSearchField stuff is correct but I just realized that you need to use quotes in your

Re: RegexTransformer

2010-03-15 Thread Michael Kuhlmann
On 03/15/10 08:56, Shalin Shekhar Mangar wrote: On Mon, Mar 15, 2010 at 2:12 AM, blargy zman...@hotmail.com wrote: How would I go about splitting a column by a certain delimiter AND ignore all empty matches. [...] You will probably have to write a custom Transformer to remove empty values.

Re: Switching cores dynamically

2010-03-19 Thread Michael Kuhlmann
On 03/19/10 11:18, muneeb wrote: Hi, I have indexed almost 7 million articles on two separate cores, each with their own conf/ and data/ folder, i.e. they have their individual index. What I normally do is, use core0 for querying and core1 for any updates and once updates are finished i

Re: Relevancy and random sorting

2012-01-12 Thread Michael Kuhlmann
Does the random sort function help you here? http://lucene.apache.org/solr/api/org/apache/solr/schema/RandomSortField.html However, you will get some very old listings then, if it's okay for you. -Kuli Am 12.01.2012 14:38, schrieb Alexandre Rocco: Erick, This document already has a field

Re: java.net.SocketException: Too many open files

2012-01-24 Thread Michael Kuhlmann
Hi Jonty, no, not really. When we first had such problems, we really thought that the number of open files is the problem, so we implemented an algorithm that performed an optimize from time to time to force a segment merge. Due to some misconfiguration, this ran too often. With the result

Re: Bad Request (Solr + Weblogic + Oracle DB)

2012-02-02 Thread Michael Kuhlmann
Hi rzao! I think this is the problem: On 02.02.2012 13:59, rzoao wrote: UpdateRequest req = new UpdateRequest(); req.setAction(AbstractUpdateRequest.ACTION.COMMIT, false, false);

Re: Help:Solr can't put all pdf files into index

2012-02-09 Thread Michael Kuhlmann
I'd suggest that you check which documents *exactly* are missing in Solr index. Or find at least one that's missing, and try to figure out how this document differs from the other ones that can be found in Solr. Maybe we can then find out what exact problem there is. Greetings, -Kuli On

Re: Help:Solr can't put all pdf files into index

2012-02-09 Thread Michael Kuhlmann
I don't know much about Tika, but this seems to be a bug in PDFBox. See: https://issues.apache.org/jira/browse/PDFBOX-797 You might also have a look at this: http://stackoverflow.com/questions/7489206/error-while-parsing-binary-files-mostly-pdf At least that's what I found when I googled the

Re: sort my results alphabetically on facetnames

2012-02-14 Thread Michael Kuhlmann
Hi! On 14.02.2012 13:09, PeterKerk wrote: I want to sort my results on the facetnames (not by their number of results). From the example you gave, I'd assume you don't want to sort by facet names but by facet values. Simply add facet.sort=index to your request; see
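As a small illustration of the `facet.sort=index` suggestion, the request parameters could be assembled like this. The field name `city` and the match-all query are placeholders chosen for the example, not taken from the thread.

```python
from urllib.parse import urlencode, parse_qsl

# Hypothetical facet request; only facet.sort=index comes from the message.
facet_params = [
    ("q", "*:*"),
    ("facet", "true"),
    ("facet.field", "city"),   # placeholder field name
    ("facet.sort", "index"),   # order facet values lexicographically, not by count
]
facet_query = urlencode(facet_params)
print(facet_query)
```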

Re: Too many open files - lots of sockets

2012-03-14 Thread Michael Kuhlmann
I had the same problem, without auto-commit. I never really found out what exactly the reason was, but I think it was because commits were triggered before a previous commit had the chance to finish. We now commit after every minute or 1000 (quite large) documents, whatever comes first. And

Re: Sorting on non-stored field

2012-03-14 Thread Michael Kuhlmann
Am 14.03.2012 11:43, schrieb Finotti Simone: I was wondering: is it possible to sort a Solr result-set on a non-stored value? Yes, it is. It must be indexed, though. -Kuli

Re: Too many open files - lots of sockets

2012-03-14 Thread Michael Kuhlmann
Ah, good to know! Thank you! I already had Jetty under suspicion, but we had this failure quite often in October and November, when the bug was not yet reported. -Kuli Am 14.03.2012 12:08, schrieb Colin Howe: After some more digging around I discovered that there was a bug reported in jetty

Re: Master/Slave switch on teh fly. Replication

2012-03-16 Thread Michael Kuhlmann
Am 16.03.2012 15:05, schrieb stockii: i have 8 cores ;-) i thought that replication is defined in solrconfig.xml and this file is only load on startup and i cannot change master to slave and slave to master without restarting the servlet-container ?!?!?! No, you can reload the whole core at

Re: Maybe switching to Solr Cores

2012-03-16 Thread Michael Kuhlmann
Am 16.03.2012 16:42, schrieb Mike Austin: It seems that the biggest real-world advantage is the ability to control core creation and replacement with no downtime. The negative would be the isolation however they are still somewhat isolated. What other benefits and common real-world situations

Re: is the SolrJ call to add collection of documents a blocking function call ?

2012-03-20 Thread Michael Kuhlmann
Hi Ramdev, add() is a blocking call. Otherwise it would have to start its own background thread, which is not what a library like Solrj should do (how many threads at most? At which priority? Which thread group? How long to keep them pooled?) And, additionally, you might want to know whether the

Re: RequestHandler versus SearchComponent

2012-03-23 Thread Michael Kuhlmann
Am 23.03.2012 10:29, schrieb Ahmet Arslan: I'm looking at the following. I want to (1) map some query fields to some other query fields and add some things to FL, and then (2) rescore. I can see how to do it as a RequestHandler that makes a parser to get the fields, or I could see making a

Re: RequestHandler versus SearchComponent

2012-03-23 Thread Michael Kuhlmann
Am 23.03.2012 11:17, schrieb Michael Kuhlmann: Adding an own SearchComponent after the regular QueryComponent (or better as a last-element) is goof ... Of course, I meant good, not goof! ;) Greetings, Kuli

Re: DIH NoClassFoundError.

2012-04-25 Thread Michael Kuhlmann
Am 25.04.2012 15:57, schrieb stockii: is it not fucking possible to import DIH !?!?!? WTF! It is fucking possible, you just need to either point your goddamn classpath to the data import handler jar in the contrib folders, or you have to add the appropriate contrib folder into the lib dir

Re: Boosting fields in SOLR using Solrj

2012-04-26 Thread Michael Kuhlmann
Am 26.04.2012 00:57, schrieb Joe: Hi, I'm using the solrj API to query my SOLR 3.6 index. I have multiple text fields, which I would like to weight differently. From what I've read, I should be able to do this using the dismax or edismax query types. I've tried the following: SolrQuery query =

Re: Dynamic creation of cores for this use case.

2012-04-26 Thread Michael Kuhlmann
Am 26.04.2012 16:17, schrieb pprabhcisco123: The use case is to create a core for each customer as well as partner . Since its very difficult to create cores statically in solr.xml file for all 4500 customers , is there any way to create the cores dynamically or on the fly. Yes there is.

Re: Bridge between Solr and NoSQL

2012-05-08 Thread Michael Kuhlmann
Am 08.05.2012 04:13, schrieb Jeff Schmidt: Francois: Check out DataStax Enterprise 2.0, Solr integrated with Cassandra: http://www.datastax.com/docs/datastax_enterprise2.0/search/index And, Solbase, Solr integrated with HBase: https://github.com/Photobucket/Solbase I'm sure there are others,

Re: Partition Question

2012-05-09 Thread Michael Kuhlmann
Am 08.05.2012 23:23, schrieb Lance Norskog: Lucene does not support more than 2^32 unique documents, so you need to partition. Just a small note: I doubt that Solr supports more than 2^31 unique documents, like most other Java applications that use int values. Greetings, Kuli
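The arithmetic behind the 2^31 remark, as a short sketch: a Java `int` is a signed 32-bit value, so the largest non-negative value it can hold is 2^31 - 1. That document numbers are stored in such an `int` is the assumption the message itself makes.

```python
JAVA_INT_BITS = 32

# One bit is used for the sign, so only 31 bits remain for the magnitude.
max_java_int = 2 ** (JAVA_INT_BITS - 1) - 1
unsigned_range = 2 ** JAVA_INT_BITS

print(max_java_int)    # largest value a signed 32-bit int can hold
print(unsigned_range)  # the 2^32 figure from the quoted message
```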

Re: Field with attribut in the schema.xml ?

2012-05-10 Thread Michael Kuhlmann
Am 10.05.2012 14:33, schrieb Bruno Mannina: like that: <field name="inventor-country">CH</field> <field name="inventor-country">FR</field> but in this case I lose the link between inventor and its country? Of course, you need to index the two inventors into two distinct documents. Did you mark those

Re: Field with attribut in the schema.xml ?

2012-05-10 Thread Michael Kuhlmann
I don't know the details of your schema, but I would create fields like name, country, street etc., and a field named role, which contains values like inventor, applicant, etc. How would you do it otherwise? Create only four documents, each field containing 80 million values? Greetings, Kuli

Re: Identify indexed terms of document

2012-05-11 Thread Michael Kuhlmann
Am 10.05.2012 22:27, schrieb Ahmet Arslan: It's possible to see what terms are indexed for a field of document that stored=false? One way is to use http://wiki.apache.org/solr/LukeRequestHandler Another approach is this: - Query for exactly this document, e.g. by using the unique field -

Re: Question about cache

2012-05-11 Thread Michael Kuhlmann
Am 11.05.2012 15:48, schrieb Anderson vasconcelos: Hi, analysing the Solr server in GlassFish with JConsole, the heap memory usage doesn't exceed 4 GB. But when the top command was executed, the free memory in the operating system is only 200 MB. The physical memory is only 10 GB. Why machine

Re: Solr Shards multi core slower then single big core

2012-05-14 Thread Michael Kuhlmann
Am 14.05.2012 05:56, schrieb arjit: Thanks Erick for the reply. I have 6 cores which don't contain duplicated data. Every core has some unique data. What I thought was when I read it would read parallel 6 cores and join the result and return the query. And this would be more efficient than reading

Re: Solr Shards multi core slower then single big core

2012-05-14 Thread Michael Kuhlmann
Am 14.05.2012 13:22, schrieb Sami Siren: Sharding is (nearly) always slower than using one big index with sufficient hardware resources. Only use sharding when your index is too huge to fit into one single machine. If you're not constrained by CPU or IO, in other words have plenty of CPU cores

Re: Solr Shards multi core slower then single big core

2012-05-14 Thread Michael Kuhlmann
Am 14.05.2012 16:18, schrieb Otis Gospodnetic: Hi Kuli, In a client engagement, I did see this (N shards on 1 beefy box with lots of RAM and CPU cores) be faster than 1 big index. I want to believe you, but I also want to understand. Can you explain why? And did this only happen for single

Re: org.apache.solr.common.SolrException: ERROR: [doc=null] missing required field: id

2012-05-21 Thread Michael Kuhlmann
Am 21.05.2012 12:07, schrieb Tolga: Hi, I am getting this error: [doc=null] missing required field: id [...] I've got this entry in schema.xml: <field name="id" type="string" stored="true" indexed="true"/> What to do? Simply make sure that every document you're sending to Solr contains this id

Re: org.apache.solr.common.SolrException: ERROR: [doc=null] missing required field: id

2012-05-21 Thread Michael Kuhlmann
Am 21.05.2012 12:40, schrieb Tolga: How do I verify it exists? I've been crawling the same site and it wasn't giving an error on Thursday. It depends on what you're doing. Are you using nutch? -Kuli

Re: Stopword filter - refreshing stop word list periodically

2011-10-14 Thread Michael Kuhlmann
Am 14.10.2011 15:10, schrieb Jithin: Hi, Is it possible to refresh the stop word list periodically say once in 6 hours. Is this already supported in Solr or are there any work arounds. Kindly help me in understanding this. Hi, you can trigger a reload command to the core admin, assuming
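A reload request to the core admin could be built as sketched below, for example from a script that a scheduler runs every six hours. The host, port and core name are assumptions for illustration; `action=RELOAD` and `core` are the CoreAdmin parameters for reloading a core.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def core_reload_url(base_url, core_name):
    """Return the CoreAdmin URL that reloads one core (and its config files)."""
    query = urlencode({"action": "RELOAD", "core": core_name})
    return f"{base_url}/admin/cores?{query}"


# Placeholder base URL and core name.
reload_url = core_reload_url("http://localhost:8983/solr", "core0")
print(reload_url)
```

The sketch only constructs the URL; actually sending it (and how often) is left to whatever scheduler is already in place.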

Re: prefix search

2011-10-25 Thread Michael Kuhlmann
I think what Radha Krishna (is this really her name?) means is different: She wants to return only the matching token instead of the complete field value. Indeed, this is not possible. But you could use highlighting (http://wiki.apache.org/solr/HighlightingParameters), and then extract the

Re: Query/Delete performance difference between straight HTTP and SolrJ

2011-10-26 Thread Michael Kuhlmann
Hi, Am 25.10.2011 23:53, schrieb Shawn Heisey: On 10/20/2011 11:00 AM, Shawn Heisey wrote: [...] I've noticed a performance discrepancy when processing every one of my delete records, currently about 25000 of them. I didn't understand what a delete record is. Do you delete records in Solr?

Re: java.net.SocketException: Too many open files

2011-10-26 Thread Michael Kuhlmann
Hi; we have a similar problem here. We already raised the file ulimit on the server to 4096, but this only deferred the problem. We get a TooManyOpenFilesException every few months. The problem has nothing to do with real files. When we had the last TooManyOpenFilesException, we investigated with

Re: Query/Delete performance difference between straight HTTP and SolrJ

2011-10-27 Thread Michael Kuhlmann
Am 26.10.2011 18:29, schrieb Shawn Heisey: For inserting, I do use a Collection of SolrInputDocuments. The delete process grabs values from idx_delete, does a query like the above (the part that's slow in Java), then if any documents are found, issues a deleteByQuery with the same string.

Re: Query/Delete performance difference between straight HTTP and SolrJ

2011-10-27 Thread Michael Kuhlmann
Sorry, I was wrong. Am 27.10.2011 09:36, schrieb Michael Kuhlmann: and you'll get the number of affected documents in your response anyway. That's not true, you don't get the affected document count. Anyway, it's still true that you don't need to check for documents first, at least not when you

Re: Always return total number of documents

2011-10-28 Thread Michael Kuhlmann
Am 28.10.2011 11:16, schrieb Robert Brown: Is there no way to return the total number of docs as part of a search? No, there isn't. Usually this information is of absolutely no value to the end user. A workaround would be to add some field to the schema that has the same value for every document,

Re: Is SQL Like operator feature available in Apache Solr query

2011-11-01 Thread Michael Kuhlmann
Hi, this is not exactly true. In Solr, you can't have the wildcard operator on both sides of the operator. However, you can tokenize your fields and simply query for Solr. This is what Solr is made for. :) -Kuli Am 01.11.2011 13:24, schrieb François Schiettecatte: Arshad Actually it is

Re: Is SQL Like operator feature available in Apache Solr query

2011-11-01 Thread Michael Kuhlmann
Am 01.11.2011 16:06, schrieb Erick Erickson: NGrams are often used in Solr for this case, but they will also add to your index size. It might be worthwhile to look closely at your user requirements before going ahead and supporting this functionality Best Erick My opinion. Wildcards are

Re: representing latlontype in pojo

2011-11-09 Thread Michael Kuhlmann
Am 08.11.2011 23:38, schrieb Cam Bazz: How can I store a 2d point and index it to a field type that is latlontype, if I am using solrj? Simply use a String field. The format is $latitude,$longitude. -Kuli
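The `$latitude,$longitude` format can be produced with a tiny helper such as the one below. It is only a sketch; the function name and the range checks are additions for this example and not part of SolrJ.

```python
def to_latlon(latitude, longitude):
    """Format a 2D point as the 'latitude,longitude' string described above."""
    if not -90.0 <= latitude <= 90.0:
        raise ValueError("latitude must be between -90 and 90")
    if not -180.0 <= longitude <= 180.0:
        raise ValueError("longitude must be between -180 and 180")
    return f"{latitude},{longitude}"


print(to_latlon(52.52, 13.405))
```

The resulting string is what would be assigned to the String field of the bean before indexing.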

Re: Solr 3.3 Sorting is not working for long fields

2011-11-14 Thread Michael Kuhlmann
Am 14.11.2011 09:33, schrieb rajini maski: query : http://localhost:8091/Group/select?/indent=on&q=studyid:120&sort=studyidasc,groupid asc,subjectid asc&start=0&rows=10 Is it a copy-and-paste error, or did you really sort on studyidasc? I don't think you have a field studyidasc, and Solr

Re: two word phrase search using dismax

2011-11-15 Thread Michael Kuhlmann
Am 14.11.2011 21:50, schrieb alx...@aim.com: Hello, I use solr3.4 and nutch 1.3. In request handler we have <str name="mm">2&lt;-1 5&lt;-2 6&lt;90%</str> As far as I know this means that for two word phrase search match must be 100%. However, I noticed that in most cases documents with both words are

Re: creating solr index from nutch segments, no errors, no results

2011-11-15 Thread Michael Kuhlmann
I don't know much about nutch, but it looks like there's simply a commit missing at the end. Try to send a commit, e.g. by executing curl http://host:port/solr/core/update -H "Content-Type: text/xml" --data-binary '<commit />' -Kuli Am 15.11.2011 09:11, schrieb Armin Schleicher: hi there,
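The same commit message can be prepared from Python's standard library, sketched here without actually sending anything. The URL is a placeholder; the XML body and the content type mirror the curl command in the message.

```python
import urllib.request


def build_commit_request(update_url):
    """Prepare (but do not send) a POST request carrying an XML commit message."""
    return urllib.request.Request(
        update_url,
        data=b"<commit/>",
        headers={"Content-Type": "text/xml"},
        method="POST",
    )


# Placeholder host, port and core name.
commit_request = build_commit_request("http://localhost:8983/solr/core0/update")
print(commit_request.get_method(), commit_request.full_url)
```

Passing `commit_request` to `urllib.request.urlopen` would send it to a running server.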

Re: Solr 3.3 Sorting is not working for long fields

2011-11-15 Thread Michael Kuhlmann
Hi, Am 15.11.2011 10:25, schrieb rajini maski: <fieldType name="long" class="solr.TrieLongField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/> [...] <fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/> [...] <field

Re: Problems installing Solr PHP extension

2011-11-16 Thread Michael Kuhlmann
Am 16.11.2011 17:11, schrieb Travis Low: If I can't solve this problem then we'll basically have to write our own PHP Solr client, which would royally suck. Oh, if you really can't get the library to work, no problem - there are several PHP clients out there that don't need a PECL installation.

Re: Add copyTo Field without re-indexing?

2011-11-16 Thread Michael Kuhlmann
Am 17.11.2011 08:46, schrieb Kashif Khan: Please advise how we can reindex SOLR with having fields stored=false. we can not reindex data from the beginning just want to read and write indexes from the SOLRJ only. Please advise a solution. I know we can do it using lucene classes using

Re: Aggregated indexing of updating RSS feeds

2011-11-17 Thread Michael Kuhlmann
Am 17.11.2011 11:53, schrieb sbarriba: The 'params' logging pointer was what I needed. So for reference its not a good idea to use a 'wget' command directly in a crontab. I was using: wget http://localhost/solr/myfeed?command=full-import&rows=5000&clean=false :)) I think the shell handled the
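The reply is cut off, so the following is only a guess at the kind of problem meant: it assumes the URL's parameters were separated by `&`, which a shell treats as a control operator unless the argument is quoted. A sketch of quoting the URL before it is placed in a command line:

```python
import shlex

# Assumed form of the import URL, with '&' between the parameters.
import_url = "http://localhost/solr/myfeed?command=full-import&rows=5000&clean=false"

# shlex.quote wraps the argument so the shell passes it through as one word.
command_line = "wget " + shlex.quote(import_url)
print(command_line)
```

Splitting `command_line` again with `shlex.split` yields exactly two words, the program name and the complete URL.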

Re: PatternTokenizer failure

2011-11-29 Thread Michael Kuhlmann
Am 29.11.2011 15:20, schrieb Erick Erickson: Hmmm, I tried this in straight Java, no Solr/Lucene involved and the behavior I'm seeing is that no example works if it has more than one whitespace character after the hyphen, including your failure example. I haven't lived inside regexes for long

Re: Best practise to automatically change a field value for a specific period of time

2011-12-02 Thread Michael Kuhlmann
Hi Mark, I'm sure you can manage this using function queries somehow, but this is rather complicated, esp. if you both want to return the price and sort on it. I'd rather update the index as soon as a campaign starts or ends. At least that's how we did it when I worked for online shops.

Re: SolR for time-series data

2011-12-05 Thread Michael Kuhlmann
Hi Alan, Solr can do this fast and easily, but I wonder if a simple key-value store wouldn't suit your needs better. Do you really only need to query by chart_id, or do you also need to query by time range? In either case, as long as your data fits into an in-memory database, I would

Re: Replication not done for real on commit?

2011-12-05 Thread Michael Kuhlmann
Am 05.12.2011 14:28, schrieb Per Steffensen: Hi Reading http://wiki.apache.org/solr/SolrReplication I notice the pollInterval (guess it should have been pullInterval) on the slaves. That indicate to me that indexed information is not really pushed from master to slave(s) on events defined by

Re: Solr response writer

2011-12-07 Thread Michael Kuhlmann
Am 07.12.2011 14:26, schrieb Finotti Simone: That's the scenario: I have an XML that maps words W to URLs; when a search request is issued by my web client, a query will be issued to my Solr application. If, after stemming, the query matches any in W, the client must be redirected to the

Re: R: Solr response writer

2011-12-07 Thread Michael Kuhlmann
Am 07.12.2011 15:09, schrieb Finotti Simone: I got your and Michael's point. Indeed, I'm not very skilled in web development so there may be something that I'm missing. Anyway, Endeca does something like this: 1. accept a query 2. does the stemming; 3. check if the result of the step 2.

Re: Copying few field using copyField to non multiValued field

2011-06-15 Thread Michael Kuhlmann
In addition to Bob's response: Am 15.06.2011 13:59, schrieb Omri Cohen: [...] <field name="at_location" type="text" indexed="index" stored="true" required="false" /> <field name="at_country" type="text" indexed="index" stored="true" required="false" /> <field name="at_city"

Re: Copying few field using copyField to non multiValued field

2011-06-16 Thread Michael Kuhlmann
Hi Omri, there are two limitations: 1. You can't sort on a multiValued field. (Anyway, on which of the copied fields would you want to sort first?) 2. You can't make the multiValued field the unique key. Neither is a real limitation: 1. Better sort on at_country, at_state, at_city instead. 2.

Re: MultiValued facet behavior question

2011-06-22 Thread Michael Kuhlmann
Am 22.06.2011 05:37, schrieb Bill Bell: It can get more complicated. Here is another example: q=cardiology&defType=dismax&qf=specialties (Cardiology and cardiologist are stems)... But I don't really know which value in Cardiologist match perfectly. Again, I only want it to return:

Re: MultiValued facet behavior question

2011-06-22 Thread Michael Kuhlmann
Am 22.06.2011 09:49, schrieb Bill Bell: You can type q=cardiology and match on cardiologist. If stemming did not work you can just add a synonym: cardiology,cardiologist Okay, synonyms are the only way I can think of a realistic match. Stemming won't work on a facet field; you wouldn't get

Re: Inconsistent search results

2011-06-27 Thread Michael Kuhlmann
Am 27.06.2011 15:56, schrieb Jihed Amine Maaref: - normalizedContents:(EDOUAR* AND une) doesn't return anything This was discussed few days ago: http://lucene.472066.n3.nabble.com/Conflict-in-wildcard-query-and-spellchecker-in-solr-search-tt3095198.html - normalizedContents:(edouar* AND un)

Re: Include synonys in solr

2011-06-28 Thread Michael Kuhlmann
Am 28.06.2011 09:24, schrieb Romi: But as i suppose it would be very hard to include synonyms manually for each word as my application has large data. I want to know is there any way that this synonym.text file generate automatically referring to all dictionary words I don't get the point

Re: Regex replacement not working!

2011-06-29 Thread Michael Kuhlmann
Am 29.06.2011 12:30, schrieb samuele.mattiuzzo: <fieldType name="salary_min_text" class="solr.TextField"> <analyzer type="index"> ... this is the final version of my schema part, but what i get is this: <doc> <float name="score">1.0</float> <str name="salary">Negotiable</str> <str

Re: updating existing data in index vs inserting new data in index

2011-07-07 Thread Michael Kuhlmann
Am 07.07.2011 16:14, schrieb Bob Sandiford: [...] (Without the optimize, 'deleted' records still show up in query results...) No, that's not true. The terms remain in the index, but the document won't show up any more. Optimize is only for performance (and disk space) optimization, as the

Re: updating existing data in index vs inserting new data in index

2011-07-07 Thread Michael Kuhlmann
Am 07.07.2011 16:52, schrieb Mark juszczec: Ok. That's really good to know because optimization of that kind will be important. Optimization is only important if you had a lot of deletes or updated docs, or if you want your segments to get merged. (At least that's what I know about it.) What

Re: Average PDF index time

2011-07-12 Thread Michael Kuhlmann
Am 12.07.2011 12:03, schrieb alexander sulz: Still, why the PHP stops working correctly is beyond me, but it seems to be fixed now. You should mind the max_execution_time parameter in your php.ini. Greetings, Kuli

Re: Result list order in case of ties

2011-07-12 Thread Michael Kuhlmann
Am 12.07.2011 12:13, schrieb Lox: Hi, In the case where two or more documents are returned with the same score, is there a way to tell Solr to sort them alphabetically? Yes, add the parameter sort=score desc,your_field_that_shall_be_sorted_alphabetically asc to your request. Greetings,
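What a sort specification like `score desc,<field> asc` does to tied documents can be shown with a local sort in Python. The sample documents and the field name `title` are invented for the example.

```python
# Invented sample results: two documents share the same score.
docs = [
    {"title": "pear", "score": 1.5},
    {"title": "apple", "score": 1.5},
    {"title": "zebra", "score": 2.0},
]

# Primary key: score descending. Secondary key: title ascending (the tie-breaker).
ranked = sorted(docs, key=lambda d: (-d["score"], d["title"]))

# The equivalent request parameter, using the placeholder field name.
sort_param = "score desc,title asc"

print([d["title"] for d in ranked])
```

The secondary key only matters where the scores are exactly equal; everywhere else the relevance order is unchanged.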

Re: Can I still search documents once updated?

2011-07-13 Thread Michael Kuhlmann
Am 13.07.2011 14:05, schrieb Gabriele Kahlout: this is what i was expecting. Otherwise updating a field of a document that has an unstored but indexed field is impossible (without losing the unstored but indexed field. I call this updating a field of a document AND deleting/updating all its

Re: Can I still search documents once updated?

2011-07-13 Thread Michael Kuhlmann
Am 13.07.2011 15:37, schrieb Gabriele Kahlout: Well, I'm !sure how usual this scenario would be: 1. In general those using solr with nutch don't store the content field to avoid storing the whole web/intranet in their index, twice (1 in the form of stored data, and one in the form of indexed

Re: Can I still search documents once updated?

2011-07-13 Thread Michael Kuhlmann
Am 13.07.2011 16:09, schrieb Gabriele Kahlout: Solr is already configured by default not to store more than a maxFieldLength anyway. Usually one stores content only to display snippets. Yes, but the snippets must come from somewhere. For instance, if you're using Solr's highlighting feature,

LockObtainFailedException and open finalizing IndexWriters

2011-07-18 Thread Michael Kuhlmann
Hi, we are running Solr 3.2.0 on Jetty for a web application. Since we just went online and are still in beta tests, we don't have very much load on our servers (indeed, they're currently much oversized for the current usage), and our index size on file system is just 1.1 MB. We have one

Re: Logically equivalent queries but vastly different no of results?

2011-07-22 Thread Michael Kuhlmann
Am 22.07.2011 14:27, schrieb cnyee: I think I know what it is. The second query has higher scores than the first. The additional condition domain_ids:(0^1.3 OR 1) which evaluates to true always - pushed up the scores and allows a LOT more records to pass. This can't be, because the score

Re: How many doc/doc in the XML source file before indexing?

2012-05-24 Thread Michael Kuhlmann
There is no hard limit for the maximum number of documents per update. It's only memory dependent. The smaller each document, and the more memory Solr can acquire, the more documents you can send in one update. However, I wouldn't pish it too jard anyway. If you can send, say, 100 documents
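Splitting a large set of documents into fixed-size updates can be sketched as follows. The batch size of 100 echoes the figure mentioned in the message; the helper name and the stand-in documents are made up for the example.

```python
def batches(documents, batch_size):
    """Yield consecutive slices of at most `batch_size` documents."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(documents), batch_size):
        yield documents[start:start + batch_size]


# Stand-in documents; in practice these would be the records to index.
all_docs = [{"id": str(i)} for i in range(250)]
batch_sizes = [len(batch) for batch in batches(all_docs, 100)]
print(batch_sizes)
```

Each yielded slice would then be sent as one update request, which keeps the memory needed per request bounded.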

Re: How many doc/doc in the XML source file before indexing?

2012-05-24 Thread Michael Kuhlmann
pish it too jard - sounds funny. :) I meant push it too hard. Am 24.05.2012 11:46, schrieb Michael Kuhlmann: There is no hard limit for the maximum number of documents per update. It's only memory dependent. The smaller each document, and the more memory Solr can acquire, the more documents

Re: How many doc/doc in the XML source file before indexing?

2012-05-24 Thread Michael Kuhlmann
, no problem, I will check it and re-generate it. Is it bad to create a file with 5M doc ? Le 24/05/2012 11:46, Michael Kuhlmann a écrit : There is no hard limit for the maximum number of documents per update. It's only memory dependent. The smaller each document, and the more memory Solr can

Re: Query elevation / boosting or something else to guarantee document position

2012-05-31 Thread Michael Kuhlmann
Hi Wenca, I'm a bit late, but maybe you're still interested. There's no such functionality in standard Solr. With sorting, this is not possible, because sort functions only rank each single document, they know nothing about the position of the others. And query elevation is similar, you'll

Re: ERROR 400 undefined field

2012-06-07 Thread Michael Kuhlmann
Am 07.06.2012 09:55, schrieb sheethal shreedhar: http://localhost:8983/solr/select/?q=fruit&version=2.2&start=0&rows=10&indent=on I get HTTP ERROR 400 Problem accessing /solr/select/. Reason: undefined field text Look at your schema.xml. You'll find a line like this:

Re: timeAllowed flag in the response

2012-06-08 Thread Michael Kuhlmann
Hi Laurent, alas there is currently no such option. The time limit is handled by an internal TimeLimitingCollector, which is used inside SolrIndexSearcher. Since the using method only returns the DocList and doesn't have access to the QueryResult, it won't be easy to return this information
