Has anybody done personalized search with Solr? I'm thinking of including
fields such as bought or like per member/visitor via dynamic fields to a
product search schema. Another option is to have a multi-value field that
can contain user IDs. What are the possible performance issues with this
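To make the two options concrete, here is a hypothetical schema sketch (field names and types are invented for illustration, not taken from any real schema):

```xml
<!-- Option 1: one dynamic field per member, e.g. bought_12345=true -->
<dynamicField name="bought_*" type="boolean" indexed="true" stored="false"/>

<!-- Option 2: one multi-valued field holding IDs of members who bought/liked the product -->
<field name="bought_by" type="string" indexed="true" stored="false" multiValued="true"/>
```

With option 2 a per-member filter would be a simple fq on bought_by.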
Hi All
I am getting an error in Solr:
Error loading class 'Solr.TrieField'
I have added the following in the types section of the schema file:
<fieldType name="tint" class="solr.TrieField" omitNorms="true"/>
And in the custom fields of the schema I have added:
<field name="bitrate" type="tint" indexed="true" stored="true"/>
I am using
Hi Rih,
Are you going to include one or two common fields (bought, like) per
member/visitor, OR a unique field per member/visitor?
If only one or two common fields are included then there will not be any
impact on performance. If you want to include a unique field then you need to
consider multi
Another approach would be to do query time boosts of 'my' items under
the assumption that count is limited:
- keep the SOLR index independent of bought/like
- have a db table with user prefs on a per item basis
- at query time, specify boosts for 'my items' items
We are planning to do this in the
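With the dismax handler, such query-time boosts for 'my items' could be passed via the bq (boost query) parameter; the item IDs and boost factors below are made-up placeholders for whatever the per-user prefs table returns:

```
/solr/select?qt=dismax&q=rock&bq=id:(123^5 OR 456^5 OR 789^5)
```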
Hey Ahmet
I have added
<field name="bitrate" type="sint" indexed="true" stored="true" default="0"/>
And the request I am passing is
/solr/select?indent=on&version=2.2&q=rock&fq={!field%20f=content}mp3&fq:bitrate:[* TO 127]&start=0&rows=10&fl=*%2Cscore&qt=dismax&wt=standard&explainOther=&hl.fl=
Still I am seeing
And the request I am passing is
/solr/select?indent=on&version=2.2&q=rock&fq={!field%20f=content}mp3&fq:bitrate:[* TO 127]&start=0&rows=10&fl=*%2Cscore&qt=dismax&wt=standard&explainOther=&hl.fl=
Still I am seeing documents above bitrate 127
There is a typo: instead of fq: there should be fq=
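With that typo fixed, the range clause becomes a real filter query; the corrected request would look like this (the &-separators are reconstructed here, since the archive stripped them):

```
/solr/select?indent=on&version=2.2&q=rock&fq={!field%20f=content}mp3&fq=bitrate:[* TO 127]&start=0&rows=10&fl=*%2Cscore&qt=dismax&wt=standard
```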
Oops my bad,
Thanks much
-Original Message-
From: Ahmet Arslan [mailto:iori...@yahoo.com]
Sent: Thursday, May 20, 2010 4:31 PM
To: solr-user@lucene.apache.org
Subject: RE: how to achieve filters
And the request I am passing is
Are the Solr 1.4 statistics like #docs, #docsPending etc. exposed in
JSON format?
Andreas
Hi,
Is it possible to use Solr caches such as the query cache, filter cache
and document cache from an external caching system like memcached, as it
has several advantages such as a centralized caching system and reducing the
pause time of the JVM's garbage collection, since we can assign less
Hi All,
The problem is resolved. It is purely due to the filesystem. My filesystem was
32-bit, running on a 64-bit OS. I changed to a 64-bit filesystem and all
works as expected.
Uma
--
View this message in context:
http://lucene.472066.n3.nabble.com/index-merge-tp472904p832053.html
Sent from
Hi dc,
- at query time, specify boosts for 'my items' items
Do you mean something like a document boost, or do you want to include
something like
OR myItemId:100^100
?
Can you tell us how you would specify document boosts at query time? Or
are you querying something like a boolean field
On May 19, 2010, at 11:43pm, Rih wrote:
Has anybody done personalized search with Solr? I'm thinking of including
fields such as bought or like per member/visitor via dynamic fields to a
product search schema. Another option is to have a multi-value field that
can contain user IDs. What
Hi.
I have a question about how I can get Solr to index quicker than it does
at the moment.
I have to index (and re-index) some 3-5 million documents. These
documents are preprocessed by a Java application that effectively
combines multiple database tables with each other to form the
Hey everyone,
I've recently been given a requirement that is giving me some trouble. I need
to retrieve up to 100 documents, but I can't see a way to do it without making
100 different queries.
My schema has a multi-valued field like 'listOfIds'. Each document has between
0 and N of these ids
How about throwing a BlockingQueue,
http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/BlockingQueue.html,
between your document-creator and SolrServer? Give it a size of 10,000 or
something, with one thread trying to feed it, and one thread waiting for it to
get near full then
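A minimal, self-contained sketch of that producer/consumer arrangement (plain Java with no Solr calls; the consumer just counts documents where a real loader would batch them off to Solr, and the queue size and document count are arbitrary):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class FeederSketch {
    // Producer fills a bounded queue; consumer drains it, standing in for
    // the thread that would send document batches to Solr.
    static int run(final int total) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(10_000);
        final int[] indexed = {0};

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < total; i++) {
                    queue.put("doc-" + i); // blocks when the queue is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < total; i++) {
                    queue.take();  // blocks when the queue is empty
                    indexed[0]++;  // here you would add the doc to a Solr batch
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
        return indexed[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("indexed=" + run(50_000));
    }
}
```

The bounded queue gives you backpressure for free: a fast producer blocks on put() instead of exhausting memory, and the consumer never busy-waits.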
I have an indexed_timestamp field in my index - which lets me know when a
document was indexed:
<field name="indexed_timestamp" type="date" indexed="true" stored="true"
default="NOW" multiValued="false"/>
For some reason when doing delta indexing via DIH, this field is not being
updated.
Are timestamp
It takes that long to do indexing? I'm HOPING to have a site that has low tens
of millions of documents, up to billions.
Sounds to me like I will DEFINITELY need a cloud account at indexing time. For
the original author of this thread, that's what I'd recommend.
1/ Optimize as best as you can on
Hi guys/gals,
I am using apache-solr-1.4.0.war deployed to glassfishv3 on my development
machine which is Ubuntu 9.10 64-bit. I am using Solrj 1.4 using the
CommonsHttpSolrServer connection to that Solr instance
(http://localhost:8080/apache-solr-1.4.0) during my development. To simplify
I already have a blocking queue in place (that's my custom queue) and
luckily I'm indexing faster than what you're doing. Currently it takes
about 2 hours to index the 5M documents I'm talking about. But I still
feel as if my machine is under-utilized.
Thijs
On 20-5-2010 17:16, Nagelberg, Kallin
Why would I need faster hardware if my current hardware isn't reaching
its max capacity?
I'm already using a different machine for querying and indexing so while
indexing the queries aren't affected. Pulling an optimized snapshot
isn't even noticeable on the query-machines.
Thijs
On
Well, to be fair, I'm indexing on a modest virtualized machine with only 2 GB of
RAM, and a doc size of 5-10k that may be substantially larger than what you have.
They could be substantially smaller too. As another point of reference, my index
ends up being about 20 GB with the 5 million docs.
I
You're sure it's not blocking on indexing IO? If not then I guess it must be a
thread waiting unnecessarily in solr or your loading program. To get my loader
running at full speed I hooked it up to jprofiler's thread views to see where
the stalls were and optimized from there.
-Kallin
Here is a good article from IBM, with code, on how to do hybrid/cloud computing.
http://www.ibm.com/developerworks/library/x-cloudpt1/
Dennis Gearon
Signature Warning
EARTH has a Right To Life,
otherwise we all die.
Read 'Hot, Flat, and Crowded'
Laugh at
In my SolrJ-using application, I have a
test case which queries for “numéro” and
succeeds if I use embedded and fails if I use CommonsHttpSolrServer… I
don’t want to use embedded for a number of reasons, including that it’s not
recommended (http://wiki.apache.org/solr/EmbeddedSolr)
I am sorry
Ok. I think I understand. What's impossible about this?
If you have a single field named id that is multivalued,
then you can retrieve the documents with something like:
id:1 OR id:2 OR id:56 ... id:100
then add limit 100.
There's probably a more succinct way to do this, but I'll leave
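In Solr query terms (there is no SQL-style LIMIT; the rows parameter caps the result count), the suggestion above might be issued as something like:

```
/solr/select?q=id:(1 OR 2 OR 56 ... OR 100)&rows=100
```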
Thanks Darren,
The problem with that is that it may not return one document per id, which is
what I need. IE, I could give 100 ids in that OR query and retrieve 100
documents, all containing just 1 of the IDs.
-Kallin Nagelberg
-Original Message-
From: dar...@ontrenet.com
I see. Well, now you're asking Solr to ignore its prime directive of
returning hits that match a query. Hehe.
I'm not sure if Solr has a unique attribute.
But this sounds, to me, like you will have to filter the results yourself.
But at least you hit Solr only once before doing so.
Good luck!
I had the same issue with Tomcat; further to what Ahmet wrote, I
recommend plugging a filter into your Solr context that forces responses and
requests to be encoded in UTF-8.
On Thu, May 20, 2010 at 5:11 PM, Ahmet Arslan iori...@yahoo.com wrote:
In my SolrJ using application, I have a
test
Would each Id need to return a different doc?
If not:
you could probably use FieldCollapsing:
http://wiki.apache.org/solr/FieldCollapsing
i.e.:
- collapse on listOfIds (see wiki entry for syntax)
- constrain the field to only return the id's you
Yeah I need something like:
(id:1 and maxhits:1) OR (id:2 and maxhits:1).. something crazy like that..
I'm not sure how I can hit Solr once. If I do try and do them all in one big OR
query then I'm probably not going to get a hit for each ID. I would need to
request probably 1000 documents to
The problem here, I think, is that you only want 1 of many _results_ for a
particular ID. How would Solr know which result you want to keep? And
which to throw away?
However...
You can do this in two queries if you want. Have a separate solr document
with unique ID equal to the listOfIds value
Hi Kallin,
again please look at FieldCollapsing
(http://wiki.apache.org/solr/FieldCollapsing),
that should do the trick.
Basically: first you constrain the field 'listOfIds' to only contain docs
that contain any of the (up to) 100 random ids, as you know how to do.
Next, in the same query, specify
Hi all!
I'm trying to do some simple highlighting, but I cannot seem to figure out how
to make it work.
I'm using my own QueryParser which generates custom made queries. I would like
Solr to be able to highlight them.
I've tried many options in the highlighter but cannot get any snippets to
Thanks, I'm going to take a look at the FieldCollapsing query as it seems like
it should do the trick!
-Kallin Nagelberg
-Original Message-
From: Geert-Jan Brits [mailto:gbr...@gmail.com]
Sent: Thursday, May 20, 2010 1:03 PM
To: solr-user@lucene.apache.org
Subject: Re: seemingly impossible
Hi All,
How can I see all of the queries sent to my DB during a Delta Import?
It seems like my documents are not being updated via delta import
When I use Solr's DataImport Handler Console - with delta-import selected I see:
<lst name="entity:getall">
  <lst name="document#1"/>
</lst>
: I am using apache-solr-1.4.0.war deployed to glassfishv3 on my
...
: INFO: [] webapp=/apache-solr-1.4.0 path=/select
: params={indent=on&version=2.2&q=numéro&fq=&start=0&rows=10&fl=*,score&qt=standard&wt=standard&explainOther=&hl.fl=}
: hits=0 status=0 QTime=16
...
: In my SolrJ
I'm really only guessing here, but based on your description of what you
are doing it sounds like you only have one thread streaming documents to
solr (via a single StreamingUpdateSolrServer instance which creates a
single HTTP connection)
Have you at all attempted to have parallel threads in
Chris,
You are the best. Switching to POST solved the problem. I hadn't noticed that
option earlier but after finding:
https://issues.apache.org/jira/browse/SOLR-612 I found the option in the code.
Thank you, you just made my day.
Secondly, in an effort to narrow down whether this was a
StreamingUpdateSolrServer already has multiple threads and uses multiple
connections under the covers. At least the API says 'Uses an internal
MultiThreadedHttpConnectionManager to manage http connections'. The constructor
allows you to specify the number of threads used,
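For reference, the SolrJ 1.4 constructor mentioned above takes the server URL, an internal queue size, and a thread count; a fragment (the URL and sizes here are placeholder values, and this obviously needs a running Solr to do anything):

```java
// queueSize=20000 buffered docs, threadCount=4 concurrent feeder threads
StreamingUpdateSolrServer server =
    new StreamingUpdateSolrServer("http://localhost:8983/solr", 20000, 4);
```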
: Starting with glassfishv3 (I think) UTF-8 is the default for URI. You
: can see this by going to the admin site, clicking on Network Config |
: Network Listeners | then select the listener. Select the tab HTTP and
: about half way down, you will see URI Encoding: UTF-8.
:
: HOWEVER, that
: I am trying to subclass DIH to add ... I am having a hard time trying to get
: access to the current Solr context. How is this possible?
I don't think DIH was particularly designed to be subclassed (I'm surprised
it's not final) ... it was built with the assumption that people would
write
: StreamingUpdateSolrServer already has multiple threads and uses multiple
: connections under the covers. At least the API says 'Uses an internal
Hmmm... I think one of us misunderstands the point behind
StreamingUpdateSolrServer and its internal threads/queues. (It's very
possible that
Yeah this looks perfect. Too bad it's not in 1.4, I guess I can build from
trunk and patch it. This is probably a stupid question but is there any feeling
as to when 1.5 might come out?
Thanks,
-Kallin Nagelberg
-Original Message-
From: Geert-Jan Brits [mailto:gbr...@gmail.com]
Sent:
Ok to further explain myself.
Well, first off, I was experiencing a StackOverflow error during my
delta-imports after doing a full-import. The strange thing was, it only
happened sometimes. The thread is here:
I wanted to improve the documentation in the solr wiki by adding in my
findings. However, when I try to log in and create a new account, I
receive this error message:
You are not allowed to do newaccount on this page. Login and try again.
Does anyone know how I can get permission to add a page
Actually, it's not as much a Solr problem as a Lucene one; as it turns out, the
WeightedSpanTermExtractor is in Lucene, not Solr.
Why they decided to only highlight queries that are in Lucene I don't know, but
what I did to solve this problem was simply to make my queries extend a Lucene
First of all, I'd like to apologize in advance for being a pretty raw newbie
when it comes to search technologies, so please bear with me!
The situation:
My company has a system that moderates 15 character free form text fields.
We have a dictionary of words in our database that are banned due
I know this post is old but did you ever get a resolution to this problem? I
am running into the exact same issue. I even switched my id from text to
string and reindexed, as that was the last suggestion, and still no
resolution.
--Tony
So are we the only ones who never got sharding working with multi-cores?
Bummer... Hopefully someone else will chime in with an answer.
--Tony
Hello kkieser.
I've used both and my name may have come up in your searches. For your
system, I would definitely not use Endeca as it's too complicated for the
relatively simple needs that you have. You asked if there are technical
differences, and of course being two different systems, the
rant_by_HTTP_Verb_Nazi
Using POST totally violates the access model for an entity in the HTTP Verb
model.
Basically:
GET=READ
POST=CREATE
PUT=MODIFY
DELETE=(drum roll please)DELETE
Granted, the whole web uses POST for modify, but let's not make the situation
worse by using it for everything.
Thanks for your response David! At the moment we have over 40,000 words on
our banned list, and only recently added the white list, so we anticipate
this number will jump quite quickly. I've heard Solr can handle up to around 2
million records before slowing down, so I'm not too worried about
kkieser,
It just occurred to me that Solr might actually fit the bill. Your scenario
is definitely not a typical use of Solr, but a novel
use of Solr I am about to describe could totally get what you want.
A Solr index is composed of documents which are typically similar
http://wiki.apache.org/solr/SolrJmx#Remote_Connection_to_Solr_JMX
Ask the wiki!
On Wed, May 19, 2010 at 6:19 AM, Na_D nabam...@zaloni.com wrote:
Thanks for the info , using the above properties solved the issue .
Hello,
Solr looks like an excellent API and it's nice to have a tutorial that makes it
easy to discover the basics of what Solr does; I'm impressed. I can see plenty
of potential uses of Solr/Lucene, and I'm interested now in just how real-time
the queries made to an index can be?
For
Hi all,
We'd started using embedded Solr back in 2007, via a patched version
of the in-progress 1.3 code base.
I recently was reading http://wiki.apache.org/solr/EmbeddedSolr, and
wondered about the paragraph that said:
The simplest, safest, way to use Solr is via Solr's standard HTTP
Solr is a very good engine, but it is not real-time. You can turn off the
caches and reduce the delays, but it is fundamentally not real-time.
I work at MarkLogic, and we have a real-time transactional search engine (and
respository). If you are curious, contact me directly.
I do like Solr for