RE: Re: solr.DateField: org.apache.solr.common.SolrException: Error while creating field

2010-09-15 Thread Dennis Gearon
Can you give us a scenario: 1/ Like an OOP sequence diagram: this happens, that happens, now that 2/ Where do you see it being useful? Isn't it possible to convert before storing/after retrieving? Couldn't a timezone offset (or local timezone designation) be stored as a separate field to

Re: Geographic clustering

2010-09-15 Thread Dennis Gearon
From what I can tell, it's being controlled in the browser. I CAN'T tell if it's being generated in the browser or on the server. Which is it in the example, and where do you want it generated? Do you want the DATA for the clusters, or the actual icons also? Looks like a display object way to

Re: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out : SingleInstanceLock: write.lock

2010-09-15 Thread Dennis Gearon
I saw something about having separate reader vs writer to an index. The email said that the reader had to do occasional (empty) commits to keep the cache warm and for another reason. Is this relevant? Dennis Gearon Signature Warning EARTH has a Right To Life, otherwise we all

cloud or zookeeper

2010-09-15 Thread satya swaroop
Hi All, What is the difference between using shards, Solr Cloud and ZooKeeper? Which is the best way to scale Solr? I need to reduce the index size on every system and reduce the search time for a query. Regards, satya

Problem with org.apache.solr.handler.component.SearchHandler

2010-09-15 Thread Michał Flasiński
Hi, When I use version 1.4, I get an exception: ERROR [SolrCore] java.lang.NullPointerException at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:173) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) at

Apache Hadoop Get Together Berlin October 2010 - this time with a huge Mahout focus

2010-09-15 Thread Isabel Drost
Hello, this is to announce the next Apache Hadoop Get Together sponsored by JTeam (http://www.jteam.nl) that will take place in newthinking store in Berlin. When: October 7th, 5p.m. Where: Newthinking store Berlin As always there will be slots of 30min each for talks on your Hadoop topic.

Re: Geographic clustering

2010-09-15 Thread Joe Chesak
Charlie, I hear you! I'm looking for that same functionality. This problem is bigger than it looks. Your single-dimension example is a good starting point. It makes sense that when the user asks for all widgets priced between $0 and $100 he gets that information in facets. You have a couple

Re: Geographic clustering

2010-09-15 Thread gwk
Hi Charlie, I think I understand what you mean, I had a similar requirement and this is what we made: http://www.mysecondhome.co.uk/search.html?view=map It allows full faceting on all fields the site allows in normal list search. Some information about my implementation is in my original

How will solr behave if data importing is called while another importing operation is still ongoing?

2010-09-15 Thread yklxmas
Hello everyone, I've just started using Solr for one of my projects. I wonder if anyone could give me some advice on the approach we're taking. Basically we have a file system that has many XML files to be indexed by Solr. However, users might make changes to the files by using another

Re: How to install DuplicatesDetectorService

2010-09-15 Thread hellboy
Is it possible to rewrite this code in Python: private static String getFuzzyHashing(MediaUnit unit) { TextProfileSignature tps = new TextProfileSignature(); // initialise with empty parameters to force default values of TextProfileSignature attributes

Re: How will solr behave if data importing is called while another importing operation is still ongoing?

2010-09-15 Thread yklxmas
Gora Mohanty-3 wrote: On Wed, Sep 15, 2010 at 2:31 PM, yklxmas yklx...@gmail.com wrote: [...] Basically we have a file system that have many xml files to be indexed by solr. However, users might make changes to the files by using another editorial system that will export xml to the file

Re: How will solr behave if data importing is called while another importing operation is still ongoing?

2010-09-15 Thread Gora Mohanty
On Wed, Sep 15, 2010 at 4:21 PM, yklxmas yklx...@gmail.com wrote: [...] I'm using standard data import handler with file data source and xpath processor. so my script will be calling http://host:8983/solr/dataimport?command=full-import I am not sure if you are aware of this, but unless you are

Re: Problem with org.apache.solr.handler.component.SearchHandler

2010-09-15 Thread Erick Erickson
What request did you submit when this happened? Because I don't think merely declaring the component matters unless you use it, so I doubt that'd make any difference... Best Erick 2010/9/15 Michał Flasiński michal.flasin...@hybris.de Hi, When I use 1.4 version, I get exception: ERROR

Re: How to install DuplicatesDetectorService

2010-09-15 Thread Erick Erickson
Have you looked at: http://wiki.apache.org/solr/Deduplication Best Erick On Wed, Sep 15, 2010 at 4:58 AM, hellboy pbon...@googlemail.com wrote: Is there possible to rewrite this code to Python: private static String getFuzzyHashing(MediaUnit unit) {

Solr returning irrelevant results

2010-09-15 Thread Nguyen, Vincent (CDC/OSELS/PHITPO) (CTR)
Hi, I was running a query on the word mining and got results from documents that have nothing to do with mining. I got results with a score of 0.2997284 and less. It looks like Solr was querying the dsm.fulltext field for mine as well, which is ok except there were no mine words in the

RE: Solr returning irrelevant results

2010-09-15 Thread Nguyen, Vincent (CDC/OSELS/PHITPO) (CTR)
Sorry about that, I made it uppercase to emphasize it. The word was just examined Vincent Vu Nguyen Division of Science Quality and Translation Office of the Associate Director for Science Centers for Disease Control and Prevention (CDC) 404-498-6154 Century Bldg 2400 Atlanta, GA 30329

Re: Color search for images

2010-09-15 Thread Ken Krugler
On Sep 15, 2010, at 7:59am, Shawn Heisey wrote: My index consists of metadata for a collection of 45 million objects, most of which are digital images. The executives have fallen in love with Google's color image search. Here's a search for flower with a red color filter:

Re: Color search for images

2010-09-15 Thread Shashi Kant
Shawn, I have done some research into this, machine-vision especially on a large scale is a hard problem, not to be entered into lightly. I would recommend starting with OpenCV - a comprehensive toolkit for extracting various features such as Color, Edge etc from images. Also there is a project

Re: Color search for images

2010-09-15 Thread Shashi Kant
On a related note, I'm curious if anyone has run across a good set of algorithms (or hopefully a library) for doing naive image classification. I'm looking for something that can classify images into something similar to the broad categories that Google image search has (Face, Photo, Clip

RE: Solr returning irrelevant results

2010-09-15 Thread Nguyen, Vincent (CDC/OSELS/PHITPO) (CTR)
Actually, I think I found the issue. Some of the PDFs weren't OCR'ed very well and the text from the word examined was read as ~8 mined Vincent Vu Nguyen Division of Science Quality and Translation Office of the Associate Director for Science Centers for Disease Control and Prevention (CDC)

Null Pointer Exception while indexing

2010-09-15 Thread andrewdps
What could be possible error for 14-Sep-10 4:28:47 PM org.apache.solr.common.SolrException log SEVERE: java.util.concurrent.ExecutionException: java.lang.NullPointerException at java.util.concurrent.FutureTask$Sync.innerGet(libgcj.so.90) at java.util.concurrent.FutureTask.get(libgcj.so.90)

Re: Null Pointer Exception while indexing

2010-09-15 Thread Yonik Seeley
On Wed, Sep 15, 2010 at 1:12 PM, andrewdps mstpa...@gmail.com wrote: What could be possible error for 14-Sep-10 4:28:47 PM org.apache.solr.common.SolrException log SEVERE: java.util.concurrent.ExecutionException: java.lang.NullPointerException   at

Re: Boosting specific field value

2010-09-15 Thread Erick Erickson
This seems like a simple query-time boost, although I may not be understanding your problem well. That is, q=source:(bbc OR "associated press")^10 As for boosting more recent documents, see: http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_boost_the_score_of_newer_documents HTH Erick On

Re: Null Pointer Exception while indexing

2010-09-15 Thread andrewdps
I'm sorry, but how do I use that? Is that something to do with uninstalling gcj and installing a JVM and OpenJDK? Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/Null-Pointer-Exception-while-indexing-tp1481154p1481285.html Sent from the Solr - User mailing list archive

Simple Filter Query (fq) Use Case Question

2010-09-15 Thread Andre Bickford
I'm working on creating a solr index search for a charitable organization. The solr index stores documents of donors. Each donor document has the following four fields: Id Name Address Gift Amount (multiValued) Gift Date (multiValued) In our relational database, there is a one-to-many

Re: Null Pointer Exception while indexing

2010-09-15 Thread andrewdps
I still get the same error when I try to index the mrc file... This was the previous version of the Java on our server. # java -version java version 1.5.0 gij (GNU libgcj) version 4.3.2 Copyright (C) 2007 Free Software Foundation, Inc. This is free software; see the source for copying

Null Pointer Exception while indexing

2010-09-15 Thread andrewdps
What could be possible error for 14-Sep-10 4:28:47 PM org.apache.solr.common.SolrException log SEVERE: java.util.concurrent.ExecutionException: java.lang.NullPointerException at java.util.concurrent.FutureTask$Sync.innerGet(libgcj.so.90) at java.util.concurrent.FutureTask.get(libgcj.so.90)

Re: Simple Filter Query (fq) Use Case Question

2010-09-15 Thread Erick Erickson
One strategy is to denormalize all the way. That is, each Solr document is Gift Amount and Gift Date would not be multiValued. You'd create a different document for each gift, so you'd have multiple documents with the same Id, Name, and Address. Be careful, though, if you've defined Id as a
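A minimal Python sketch of the one-document-per-gift idea described above. The field names (`donorId`, `giftAmount`, `giftDate`) and the composite per-gift `id` are hypothetical choices for illustration only; they are not taken from the poster's schema.

```python
def gifts_to_documents(donor):
    """Flatten one donor record into one document per gift.

    Assumes giftAmount and giftDate are parallel lists (same order, same length).
    """
    docs = []
    for i, (amount, date) in enumerate(zip(donor["giftAmount"], donor["giftDate"])):
        docs.append({
            # Hypothetical composite key so each gift document has a distinct id.
            "id": f'{donor["id"]}-{i}',
            "donorId": donor["id"],
            "name": donor["name"],
            "address": donor["address"],
            "giftAmount": amount,   # single value, no longer multiValued
            "giftDate": date,       # single value, no longer multiValued
        })
    return docs


donor = {
    "id": "42",
    "name": "Example Donor",
    "address": "1 Example Street",
    "giftAmount": [200, 1000],
    "giftDate": ["2006-05-01", "2007-03-15"],
}
for doc in gifts_to_documents(donor):
    print(doc)
```

Because amount and date now sit on the same document, a filter on both applies to the same gift rather than to any two gifts of one donor.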

Re: Geographic clustering

2010-09-15 Thread Dennis Gearon
To me, it's a great idea. But I would prefer 2D areas superimposed over the map with a count per area, probably positioned near the median density point. Don't know of any applications that do this, or COULD do this, but intuitively, that feels like the right format. BTW, what is your usage

RE: Simple Filter Query (fq) Use Case Question

2010-09-15 Thread Andre Bickford
Thanks for the response Erick. I did actually try exactly what you suggested. I flipped the index over so that a gift is the document. This solution certainly solves the previous problem, but introduces a new issue where the search results show duplicate donors. If a donor gave 12 times in a

using variables/properties in dataconfig.xml

2010-09-15 Thread Jason Chaffee
Is it possible to use the same type of property configuration in dataconfig.xml as is possible in solrconfig.xml? I tried it and it didn't seem to work. For example, <dataDir>${solr.data.dir:/opt/search/store/solr/data}</dataDir> And in the dataconfig.xml, I would like to do this to

Re: Simple Filter Query (fq) Use Case Question

2010-09-15 Thread Jonathan Rochkind
I might consider what Erick suggested to actually be 'normalization' rather than de-normalization! It's just that in Solr you only get one 'table'. Here's yet another approach, which will have its own trade-offs: Keep the document as it is, representing a donor. But in addition to indexing

Re: Simple Filter Query (fq) Use Case Question

2010-09-15 Thread Jonathan Rochkind
Okay, if you _only_ need to offer full years as facet drill-downs, not within a year, and not multiple years at once, you could index: -amount as a token in a multi-valued field. And zero-pad amount out to a buncha digits. 2006-0200 2007-1000 (big donor!) Now you could find
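The zero-padded tokens in the message above (2006-0200, 2007-1000) can be sketched in Python as follows. This is only an illustration under assumptions: the field name `giftYearAmount`, the padding width of 4, and the range-query helper are hypothetical, and the range relies on the field being indexed as a plain string so that terms compare lexicographically.

```python
def gift_token(year, amount, width=4):
    """Combine a gift year and a whole-number amount into one zero-padded token."""
    if amount < 0 or amount >= 10 ** width:
        raise ValueError("amount does not fit in the padded width")
    return f"{year}-{amount:0{width}d}"


def year_range_query(field, year, low, high, width=4):
    """Build a range query string over tokens for a single year.

    Zero-padding makes string order match numeric order within the year.
    """
    return f"{field}:[{gift_token(year, low, width)} TO {gift_token(year, high, width)}]"


print(gift_token(2006, 200))
print(gift_token(2007, 1000))
print(year_range_query("giftYearAmount", 2007, 500, 9999))
```

Without the padding, "2007-1000" would sort before "2007-200" as a string, which is why the amount is padded to a fixed width.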

Re: Geographic clustering

2010-09-15 Thread Dennis Gearon
Nice work! I like the squares a lot better than the other style, for some reason. What blows my mind is how many second homes for sale there are in the Grand Caymans. WOW! Dennis Gearon Signature Warning EARTH has a Right To Life, otherwise we all die. Read 'Hot, Flat, and

Re: Simple Filter Query (fq) Use Case Question

2010-09-15 Thread Andre Bickford
Hi Jonathan, Thank you very much for your creative suggestions. I also wondered if perhaps combining giftDate and giftAmount into one single token was a possible solution. I'll definitely explore this further using your ideas. I especially like your idea of combining the giftDate and giftAmount

Re: Boosting specific field value

2010-09-15 Thread Ravi Kiran
Erick, I'm afraid you misinterpreted my issue. If I query like you said, i.e. q=source:(bbc OR "associated press")^10, I will ONLY get documents with source BBC or Associated Press. What I am asking is: if my query does not deal with source at all but uses some other field...since the

RE: Boosting specific field value

2010-09-15 Thread Jonathan Rochkind
Maybe you are looking for the 'bq' (boost query) parameter in dismax? http://wiki.apache.org/solr/DisMaxQParserPlugin#bq_.28Boost_Query.29 From: Ravi Kiran [ravi.bhas...@gmail.com] Sent: Wednesday, September 15, 2010 10:02 PM To:
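A rough Python sketch of how a request using the dismax `bq` parameter mentioned above might be assembled. The `source` field and the BBC / Associated Press values come from this thread; the user query "election", the boost weight, and the helper name are made up for illustration, and the exact parameters your handler accepts should be checked against the wiki page linked above.

```python
from urllib.parse import parse_qs, urlencode


def boosted_query_params(user_query, boosts):
    """Build a URL query string with one bq clause per (field, value, weight).

    The main query is left untouched; bq only adds an optional boosting clause,
    so documents that do not match the boost are still returned.
    """
    params = [("defType", "dismax"), ("q", user_query)]
    for field, value, weight in boosts:
        params.append(("bq", f'{field}:"{value}"^{weight}'))
    return urlencode(params)


query_string = boosted_query_params(
    "election",
    [("source", "bbc", 10), ("source", "associated press", 10)],
)
print(query_string)
print(parse_qs(query_string)["bq"])
```

The design point is that the boost lives in its own parameter, so the query itself does not have to mention the `source` field at all.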

Re: Color search for images

2010-09-15 Thread Dennis Gearon
My guess is that they are leveraging text on the same web page. I'm sure there's some post doctoral types who could get a graphic shape analyzer, color analyzer, to at least say it's a flower. However, even Google would have to build new datacenters to have the horsepower to do that kind of

Re: Color search for images

2010-09-15 Thread Shashi Kant
I'm sure there's some post doctoral types who could get a graphic shape analyzer, color analyzer, to at least say it's a flower. However, even Google would have to build new datacenters to have the horsepower to do that kind of graphic processing. Not necessarily true. Like.com - which

Re: Boosting specific field value

2010-09-15 Thread Ravi Kiran
Hello Mr. Rochkind, I am using StandardRequestHandler, so I presume I cannot use the bq param, right? Is there a way we can mix dismax and the standard handler, i.e. use Lucene syntax for the query and dismax style for bq using localparams/nested queries? I remember seeing your post