I'm thinking about modifying my index process to use json because all my
docs are originally in json anyway. Are there any performance issues if I
insert json docs instead of xml docs? A colleague recommended that I
stay with xml because solr is highly optimized for xml.
I've been reading the solr source code and made modifications by
implementing a custom Similarity class.
I want to weight the score by multiplying it by a number
based on whether the current doc has a certain term in it.
So if the query was q=data_text:foo
then the Similarity class would
that contains all user
documents that are active. This docset is used as a filter during the
execution of the main query (q param),
so it only returns posts that contain the text hello for active users.
Martijn
On 28 October 2011 01:57, Jason Toy jason...@gmail.com wrote:
Does anyone have
I've written a script that does bulk insertion from my database, it
grabs chunks of 500 docs (out of 100 million ) and inserts them into
solr over http. I have 5 threads that are inserting from a queue.
After each insert I issue a commit.
Every 20 or so inserts I get this error message:
Error:
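For what it's worth, one adjustment often suggested for this kind of loader is to commit once after all batches instead of after every insert. The sketch below only illustrates that loop and is not a diagnosis of the error above. `send_batch` and `send_commit` are hypothetical callables standing in for whatever HTTP client posts to solr; they are not part of any solr API.

```python
from itertools import islice


def chunks(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def index_all(docs, send_batch, send_commit, batch_size=500):
    """Send docs in batches and commit a single time at the end.

    send_batch and send_commit are hypothetical hooks supplied by the caller.
    Returns the number of docs handed to send_batch.
    """
    sent = 0
    for batch in chunks(docs, batch_size):
        send_batch(batch)
        sent += len(batch)
    send_commit()
    return sent
```

With several worker threads, the same idea would mean that only one place issues the commit after the queue is drained.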
I have a similar problem except I need to filter scores that are too high.
On Oct 27, 2011 7:04 AM, Robert Stewart bstewart...@gmail.com wrote:
BTW, this would be a good standard feature for SOLR, as I've run into this
requirement more than once.
On Oct 27, 2011, at 9:49 AM,
Does anyone have any idea on this issue?
On Tue, Oct 25, 2011 at 11:40 AM, Jason Toy jason...@gmail.com wrote:
Hi Yonik,
Without a Join I would normally query user docs with:
q=data_text:test&fq=is_active_boolean:true
With joining users with posts, I get no results:
q={!join from
, but with the ability to join
with the Posts docs.
On Tue, Oct 25, 2011 at 11:30 AM, Yonik Seeley
yo...@lucidimagination.com wrote:
Can you give an example of the request (URL) you are sending to Solr?
-Yonik
http://www.lucidimagination.com
On Mon, Oct 24, 2011 at 3:31 PM, Jason Toy jason...@gmail.com
I have 2 types of docs, users and posts.
I want to view all the docs that belong to certain users by joining posts
and users together. I have to filter the users with a filter query of
is_active_boolean:true so that the score is not affected, but since I do a
join, I have to move the filter query
I have several different document types that I store. I use a serialized
integer that is unique within the document type. If I use id as the uniqueKey,
then docs of different types can collide on the id. What would be
the best way to have a unique id given I am storing my unique identifier
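One way to avoid the collision, sketched here as an illustration only, is to prefix the per-type integer with the type name, which matches the shape of the id "Post 75004824785129473" that appears in a document posted elsewhere on this page. The helper names `composite_id` and `to_solr_doc` are made up for this example.

```python
def composite_id(doc_type, serial):
    """Combine the type name and the per-type integer into one unique key."""
    return "%s %d" % (doc_type, serial)


def to_solr_doc(doc_type, serial, fields):
    """Return a copy of `fields` with a composite id and a type field added."""
    doc = dict(fields)
    doc["id"] = composite_id(doc_type, serial)
    doc["type"] = doc_type
    return doc


user = to_solr_doc("User", 42, {"is_active_boolean": True})
post = to_solr_doc("Post", 42, {"data_text": "hello"})
```

Here `user` and `post` share the integer 42 but get different ids, and the separate type field still allows filtering by document type.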
I'm testing out the join functionality on svn revision 1175424.
I've found that when I add a single filter query to a join it works fine, but
when I add more than 1 filter query, the query does not return results.
This single function query with a join returns results:
Can dismax understand that query in a translated form?
On Sep 29, 2011 10:01 PM, yingshou guo guoyings...@gmail.com wrote:
You can't use this kind of query syntax with the dismax query parser.
Your query can be understood by the standard query parser or the edismax query
parser. qt request parameter is
Hi all, I am testing various versions of solr from trunk, and I am finding that
often the example doesn't build and I can't test out the version. Is
there a resource that shows which versions build correctly so that we can
test it out?
Hi all,
I'd like to know what the specific disadvantages of using dynamic
fields in my schema are. About half of my fields are dynamic, but I could
move all of them to be static fields. Will my searches run faster? If there
are no disadvantages, can I just set all my fields to be dynamic?
I am running the sun version:
java version 1.6.0_26
Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
I get multiple out-of-memory exceptions when looking at my application and the
solr logs, but my script doesn't get called the first
I had a join query that was originally written as:
{!join from=self_id_i to=user_id_i}data_text:hello
and that works fine. I later added an fq filter:
{!frange l=0.05 }div(termfreq(data_text,'hello'),max_i)
and the query doesn't work anymore. If I do the fq by itself without the
join the query
I have solr issues where I keep running out of memory. I am working on
solving the memory issues (this will take a long time), but in the meantime,
I'm trying to be notified when the error occurs. I saw with the jvm I can
pass the -XX:OnOutOfMemoryError= flag and pass a script to run. Every time
Anyone know the query I would do to get the join to work? I'm unable to get
it to work.
On Wed, Sep 14, 2011 at 10:49 AM, Jason Toy jason...@gmail.com wrote:
I've been reading the information on the new join feature and am not quite
sure how I would use it given my schema structure. I have
I've been reading the information on the new join feature and am not quite
sure how I would use it given my schema structure. I have User docs and
BlogPost docs and I want to return all BlogPosts that match the fulltext
title cool that belong to Users that match the description solr.
Here are the
I had queries breaking on me when there were spaces in the text I was
searching for. Originally I had:
fq=state_s:New York
and that would break. I found a workaround by using:
fq={!raw f=state_s}New York
My problem now is doing this with an OR query. This is what I have now, but
it doesn't
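The unquoted form breaks because the query parser splits on whitespace, so only New is searched in state_s and York goes to the default field. A minimal sketch of one alternative, assuming the standard (lucene) query parser is used for fq: quote each value as a phrase and join the phrases with OR. The helper names `quote_phrase` and `field_in` and the second state are made up for the example.

```python
from urllib.parse import urlencode


def quote_phrase(value):
    """Wrap a value in double quotes, escaping backslashes and quotes inside it."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def field_in(field, values):
    """Build field:("a" OR "b") so that values with spaces stay intact."""
    return field + ":(" + " OR ".join(quote_phrase(v) for v in values) + ")"


fq = field_in("state_s", ["New York", "California"])
query_string = urlencode({"q": "*:*", "fq": fq})
```

`fq` ends up as state_s:("New York" OR "California"), and `urlencode` takes care of the URL escaping.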
I wrote the title wrong, it's a filter query, not a function query, thanks
for the correction.
The field is a string. I had tried fq=stats_s:New York before and that
did not work, and I'm puzzled as to why.
I tried out your b suggestion and that worked, thanks!
On Tue, Sep 13, 2011 at
I'd love to see the progress on this.
On Tue, Sep 13, 2011 at 10:34 AM, Roman Chyla roman.ch...@gmail.com wrote:
Hi,
The standard lucene/solr parsing is nice but not really flexible. I
saw questions and discussion about ANTLR, but unfortunately never a
working grammar, so... maybe you find
I'm trying to limit my data to only docs that have the word 'foo' appear at
least once.
I am trying to use:
fq=termfreq(data,'foo'):[1+TO+*]
but I get the syntax error:
Caused by: org.apache.lucene.queryparser.classic.ParseException: Encountered
: : at line 1, column 33.
Was expecting one of:
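Range syntax is parsed as field:[a TO b], so a function call in the field position is not accepted by the classic query parser. A hedged sketch of the usual alternative is the frange parser, which filters on a function's value. It assumes a solr build where termfreq is available (another message on this page notes it was added in the 4.0 code base). The helper name `termfreq_filter` is made up for the example.

```python
from urllib.parse import urlencode


def termfreq_filter(field, term, minimum=1):
    """Build an frange filter keeping docs where termfreq(field, term) >= minimum."""
    return "{!frange l=%d}termfreq(%s,'%s')" % (minimum, field, term)


fq = termfreq_filter("data", "foo")
query_string = urlencode({"q": "*:*", "fq": fq})
```

`fq` is {!frange l=1}termfreq(data,'foo'); leaving out the upper bound u means there is no maximum.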
After running a combination of different queries, my solr server eventually
is unable to complete certain requests because it runs out of memory, which
means I need to restart the server as it's basically useless with some
queries working and not others. I am moving to a distributed setup soon,
I have a large ec2 instance (7.5 GB RAM), and it dies every few hours with
out-of-heap-memory errors. I started raising the minimum memory, and currently I
use -Xms3072M.
I insert about 50k docs an hour and I currently have about 65 million docs
with about 10 fields each. Is this already too much
.
-Original Message-
From: Jason Toy [mailto:jason...@gmail.com]
Sent: Wednesday, August 17, 2011 5:15 PM
To: solr-user@lucene.apache.org
Subject: solr keeps dying every few hours.
I have a large ec2 instance(7.5 gb ram), it dies every few hours with out
of heap memory issues. I
I've only set the minimum memory and have not set the maximum memory. I'm doing
more investigation and I see that I have 100+ dynamic fields for my
documents, not the 10 fields I quoted earlier. I also sort against those
dynamic fields often, and I'm reading that this potentially uses a lot of
memory.
What can I do temporarily in this situation? It seems like I must eventually
move to a distributed setup. I am sorting on dynamic float fields.
On Wed, Aug 17, 2011 at 3:01 PM, Yonik Seeley yo...@lucidimagination.com wrote:
On Wed, Aug 17, 2011 at 5:56 PM, Jason Toy jason...@gmail.com wrote
I am trying to list some data based on a function I run,
specifically termfreq(post_text,'indie music'), and I am unable to do it
without passing in data to the q parameter. Is it possible to get a sorted
list without searching for any terms?
/8/8 Jason Toy jason...@gmail.com
I am trying to list some data based on a function I run,
specifically termfreq(post_text,'indie music'), and I am unable to do it
without passing in data to the q parameter. Is it possible to get a
sorted list without searching for any terms
your index size
and
number of unique terms.
On Mon, Aug 8, 2011 at 1:08 PM, Alexei Martchenko
ale...@superdownloads.com.br wrote:
You can use the standard query parser and pass q=*:*
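A minimal sketch of that suggestion, assuming a solr version that supports sorting by function and has termfreq: match everything with q=*:* and put the function in the sort parameter. The helper name `sorted_by_termfreq` is made up for the example.

```python
from urllib.parse import urlencode, parse_qs


def sorted_by_termfreq(field, term, rows=10):
    """Build a query string that matches all docs and sorts by termfreq, highest first."""
    sort = "termfreq(%s,'%s') desc" % (field, term)
    return urlencode({"q": "*:*", "sort": sort, "rows": rows})


query_string = sorted_by_termfreq("post_text", "indie music")
decoded = parse_qs(query_string)
```

Note that termfreq looks up a single indexed term, so a two-word value like 'indie music' only counts if the field indexes it as one token.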
2011/8/8 Jason Toy jason...@gmail.com
I am trying to list some data based
choices affect what is possible at query time. Lucene In Action
is a pretty good book.
On 8/8/2011 5:02 PM, Jason Toy wrote:
Aren't dismax queries able to search for phrases using the default
index (which is what I am using)? If I can already do phrase searches,
I don't understand
How can I run a query to get the result count only? I only need the count
and so I don't need solr to send me all the results back.
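A minimal sketch of the usual approach: ask for zero rows and read numFound from the response. The base URL and the query below are placeholders, and `count_only_url` and `extract_count` are names made up for the example. It assumes the JSON response writer (wt=json).

```python
from urllib.parse import urlencode


def count_only_url(base_url, query):
    """Build a select URL that asks solr for the hit count but no documents."""
    params = {"q": query, "rows": 0, "wt": "json"}
    return base_url + "?" + urlencode(params)


def extract_count(response):
    """Pull the total hit count out of an already-parsed JSON response."""
    return response["response"]["numFound"]


url = count_only_url("http://localhost:8983/solr/select", "data_text:hello")
```

With rows=0 the docs list comes back empty while numFound still reports the total number of matches.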
As I'm using solr more and more, I'm finding that I need to do searches and
then order by new criteria. So I am constantly adding new fields to solr
and then reindexing everything.
I want to know if adding all this data to solr is the normal way to
deal with sorting. I'm finding that I
How does one search for words with characters like # and +? I have tried
searching solr with #test and \#test but all my results always come up
with test and not #test. Is this some kind of configuration option I
need to set in solr?
--
- sent from my mobile
6176064373
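On the query side, characters such as + have a special meaning to the lucene query parser and need a backslash. The sketch below escapes the characters that Lucene's QueryParser.escape is understood to handle in recent versions; the exact character set is an assumption, and `escape_term` is a name made up for the example. Escaping only gets the character past the query parser. Whether # or + survives at all depends on the field's analyzer, since many tokenizers drop punctuation at index time, which would be consistent with getting test back instead of #test.

```python
import re

# Characters treated as query syntax (assumed set, modelled on QueryParser.escape).
_SPECIAL = re.compile(r'([\\+\-!():^\[\]"{}~*?|&/])')


def escape_term(term):
    """Prefix each query-syntax character in `term` with a backslash."""
    return _SPECIAL.sub(r'\\\1', term)


escaped = escape_term("google+")
```

`#` is not query syntax, so `escape_term("#test")` returns it unchanged.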
In Solr 1.3.1 I am able to store timestamps in my docs so that I can query them.
In trunk when I try to store a doc with a timestamp I get a server error. Is
there a different way I should store this data or is this a bug?
Jul 22, 2011 7:20:14 PM org.apache.solr.update.processor.LogUpdateProcessor
I haven't modified my schema in the older solr or trunk solr. Is it required
to modify my schema to support timestamps?
On Fri, Jul 22, 2011 at 4:45 PM, Chris Hostetter
hossman_luc...@fucit.org wrote:
: In Solr 1.3.1 I am able to store timestamps in my docs so that I query
them.
:
: In trunk
="solr.DateField" sortMissingLast="true"
omitNorms="true"/>
On Fri, Jul 22, 2011 at 5:00 PM, Jason Toy jason...@gmail.com wrote:
I haven't modified my schema in the older solr or trunk solr. Is it required
to modify my schema to support timestamps?
On Fri, Jul 22, 2011 at 4:45 PM, Chris Hostetter
Hi Chris, you were correct, the field was getting set as a double. Thanks
for the help.
On Fri, Jul 22, 2011 at 7:03 PM, Jason Toy jason...@gmail.com wrote:
This is the document I am posting:
<?xml version="1.0" encoding="UTF-8"?><add><doc><field name="id">Post
75004824785129473</field><field name="type">Post
According to that bug list, there are other characters that break the
sorting function. Is there a list of safe characters I can use as a
delimiter?
On Mon, Jul 18, 2011 at 1:31 PM, Chris Hostetter
hossman_luc...@fucit.org wrote:
: When I try to sort by a column with a colon in it like
:
Hi all, I found a bug that exists in 3.1 and in trunk, but not in 1.4.1.
When I try to sort by a column with a colon in it like
scores:rails_f, solr cuts off the column name from the colon
forward, so scores:rails_f becomes scores
To test, I inserted this doc:
In 1.4.1 I was able to
actually prohibited, but that could
be your problem.
Nick
On 7/18/2011 8:10 AM, Jason Toy wrote:
Hi all, I found a bug that exists in 3.1 and in trunk, but not in
1.4.1.
When I try to sort by a column with a colon in it like
scores:rails_f, solr cuts off the column name from
How does one search for the term google+ with solr? I noticed on twitter I
can search for google+: http://search.twitter.com/search?q=google%2B (which
uses lucene, not sure about solr) but searching on my copy of solr, I can't
search for google+
I am trying to use sorting by the termfreq function using the trunk code
since termfreq was added in the 4.0 code base.
I run this query:
http://127.0.0.1:8983/solr/select/?q=librarian&sort=termfreq(all_lists_text,librarian)%20desc
but I get:
HTTP ERROR 500
Problem accessing /solr/select/.
I'm trying to run the example app from the svn source, but it doesn't seem
to work. I am able to run :
java -jar start.jar
and Jetty starts with:
INFO::Started SocketConnector@0.0.0.0:8983
But then when I go to my browser and go to this address:
http://localhost:8983/solr/
I get a 404 error.
I am trying to use sorting by function on solr 3.2 and it doesn't work
with termfreq. I do this query:
/solr/select?q=test&qf=all_lists_text&defType=dismax&sort=termfreq%28all_lists_text%2Ctest%29+desc&rows=50
I get this error:
Can't determine Sort Order: 'termfreq(description_text,'test')
Ahmet, that doesn't return the idf data in my results, unless I am
doing something wrong. When you run any function, do you get the results
of the function back?
Can you show me an example query you run?
//http://wiki.apache.org/solr/FunctionQuery#idf
On Thu, Jun 9, 2011 at 9:23 AM, Jason Toy jason...@gmail.com wrote:
I want to be able to run a query like idf(text, 'term') and have that
data returned with my
I want to be able to run a query like idf(text, 'term') and have that data
returned with my search results. I've searched the docs, but I'm unable to
find how to do it. Is this possible, and how can I do that?
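A hedged sketch, assuming Solr 4.0 or later, where the fl parameter accepts function queries as pseudo-fields: list the function in fl, optionally with an alias, and its value is returned with each document. The alias idf_value and the helper name `with_idf` are made up for the example.

```python
from urllib.parse import urlencode, parse_qs


def with_idf(query, field, term):
    """Build a query string that returns idf(field, term) alongside each doc."""
    fl = "*,score,idf_value:idf(%s,'%s')" % (field, term)
    return urlencode({"q": query, "fl": fl})


query_string = with_idf("text:term", "text", "term")
decoded = parse_qs(query_string)
```

On older versions without pseudo-fields this would not apply, so treat it as something to verify against the version in use.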
. For that reason I believe the bug is in solr and not in
lucene.
Jason Toy
socmetrics
http://socmetrics.com
@jtoy