RE: Best way to anchor solr searches?

2011-08-25 Thread arian487
Thanks for the replies.  I did look at caching but our commit time is 90
seconds.  It's definitely possible for someone to make a search, change the
page, and have wonky results.  How about getting it to autowarm the x most
recent searches in the queryResultCache and that can hopefully reduce the
issues?  Though even that can result in issues with the search being out of
date.  

Application cache per user would for sure solve such issues but I'd like to
avoid this if possible.  Definitely an interesting problem...

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Best-way-to-anchor-solr-searches-tp3282576p3284674.html
Sent from the Solr - User mailing list archive at Nabble.com.


Best way to anchor solr searches?

2011-08-24 Thread arian487
If I'm searching for users based on last login time, and I search once, then
go to the second page with a new offset, I could potentially see the same
users on page 2 if the index has changed.  What is the best way to anchor it
so I avoid this?  
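
To make the problem concrete, here is a minimal plain-Java sketch of offset paging over a list that changes between requests.  It does not touch Solr at all; the class name, the user names and the in-memory list are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class PagingDrift {
    // Returns one page of an offset-paged view over a list that is
    // already sorted by last login time (newest first).
    static List<String> page(List<String> sorted, int start, int rows) {
        int from = Math.min(start, sorted.size());
        int to = Math.min(start + rows, sorted.size());
        return new ArrayList<>(sorted.subList(from, to));
    }

    public static void main(String[] args) {
        // Hypothetical users, newest login first.
        List<String> index = new ArrayList<>(List.of("alice", "bob", "carol", "dave"));
        List<String> page1 = page(index, 0, 2);

        // Between the two requests "dave" logs in and moves to the front.
        index.remove("dave");
        index.add(0, "dave");

        // Everyone else shifts down by one, so "bob" shows up on both pages.
        List<String> page2 = page(index, 2, 2);
        System.out.println(page1 + " then " + page2);
    }
}
```

Any anchoring scheme has to stop that shift from happening between the first and second request.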

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Best-way-to-anchor-solr-searches-tp3282576p3282576.html


Re: Problems generating war distribution using ant

2011-08-17 Thread arian487
Stupid me.  The output file was named something else.  I really need to make
a proper servlet mapping.  Works now :D

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Problems-generating-war-distribution-using-ant-tp3260070p3260843.html


Problems generating war distribution using ant

2011-08-16 Thread arian487
So the way I generate war files now is by running an 'ant dist' in the solr
folder.  It generates the war fine and I get a build success, and then I
deploy it to tomcat and once again the logs show it was successful (from the
looks of it).  However, when I go to 'myip:8080/solr/admin' I get an HTTP
status 404.

However, it works when I take a war from the nightly build, expand it, drop
some new class files in there that I need, and close it up again.  The solr
I have checked out seems fine though and I can't find any differences
between the war I'm generating and the one that has been generated.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Problems-generating-war-distribution-using-ant-tp3260070p3260070.html


Re: Problems generating war distribution using ant

2011-08-16 Thread arian487
Interesting.  I can use this as an option and create a custom 'war' target if
need be but I'd like to avoid this.  I'd rather do a full build from the
source code I have checked out from the SVN.  Any reason why 'ant dist'
doesn't produce a good war file?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Problems-generating-war-distribution-using-ant-tp3260070p3260122.html


RE: Hudson build issues

2011-08-11 Thread arian487
I downloaded the official build (4.0) and I've been customizing it for my
needs.  I'm not really sure how to use these scripts.  Is there somewhere in
Hudson where I can apply these scripts or something?  

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Hudson-build-issues-tp3244563p3246645.html


Re: Cache replication

2011-08-10 Thread arian487
Thanks for the advice Paul, but post processing is a must for me given the
nature of my application.  I haven't had problems yet though.  

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Cache-replication-tp3240708p3244202.html


Hudson build issues

2011-08-10 Thread arian487
Whenever I try to build this on our hudson server it says it can't find
org.apache.lucene:lucene-xercesImpl:jar:4.0-SNAPSHOT.  Is the Apache repo
lacking this artifact?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Hudson-build-issues-tp3244563p3244563.html


Cache replication

2011-08-09 Thread arian487
I'm wondering if the caches on all the slaves are replicated across (such as
queryResultCache).  That is to say, if I hit one of my slaves and cache a
result, and I make a search later and that search happens to hit a different
slave, will that first cached result be available for use?

This is pretty important because I'm going to have a lot of slaves and if
this isn't done, then I'd have a high chance of running a lot of uncached
queries.  

Thanks :)

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Cache-replication-tp3240708p3240708.html


Re: Cache replication

2011-08-09 Thread arian487
Thanks for the informative response.  I'll consider using the 'sticky'
addressing as you suggested.  The reason cache is so important for me is
because I'm actually doing more processing after the query component to come
up with my query result and I want to avoid that processing as much as
possible.  But thanks a lot!
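
One possible reading of 'sticky' addressing, sketched in plain Java: route each query string to a slave chosen by its hash, so a repeated query lands on the slave that already cached it.  The host names are hypothetical, and the suggestion in the thread may have meant something else (per-user stickiness, for example).

```java
import java.util.List;

class StickyRouter {
    // Picks the same slave for the same query string every time, so a
    // result cached on that slave can be reused by the next identical query.
    static String pick(List<String> slaves, String query) {
        int i = Math.floorMod(query.hashCode(), slaves.size());
        return slaves.get(i);
    }

    public static void main(String[] args) {
        // Hypothetical slave addresses.
        List<String> slaves = List.of("slave1:8983", "slave2:8983", "slave3:8983");
        System.out.println(pick(slaves, "q=name:smith&sort=last_login desc"));
    }
}
```

The trade-off is that adding or removing a slave remaps most queries, so the caches go cold for a while after the pool changes.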


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Cache-replication-tp3240708p3240853.html


Any way to get the value if sorting by function?

2011-07-07 Thread arian487
Let's say my sort is something like:

sort=sum(indexedField, constant).  If I have a component that runs right
after the QueryComponent, is it possible to know what this value was for
each of the documents IF the field is not stored, and only indexed?  I
scoured through the code and it didn't look like this was possible.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Any-way-to-get-the-value-if-sorting-by-function-tp3148864p3148864.html


Re: After the query component has the results, can I do more filtering on them?

2011-07-05 Thread arian487
Sorry for being vague.  Okay so these scores exist on an external server and
they change often enough.  The score for each returned user is actually
dependent on the user doing the searching (if I'm making the request, and
you make the same request, the scores are different).  So what I'm doing is
getting a bunch of scores from the external server and aggregating that with
the current scores solr gave in my component.  So here's the flow (all numbers
are arbitrary):

1) Get 10,000 results from solr from the query component
2) return a list of scores and ids from the external server (it'll return a
lot of them)
3) Out of these 10,000, I take the top 3500 docs after aggregating the
external server's scores and netcons scores.  
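
The three steps above can be sketched in plain Java as follows.  This is only an illustration: the class and method names are made up, and adding the two scores together is an assumption, since the real aggregation is more complex.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

class TopNAggregator {
    // Combines the Solr score with the external score for each id
    // (simple addition, an assumption) and returns the n best ids,
    // highest combined score first.  Ids with no external score get 0.
    static List<Integer> topN(Map<Integer, Float> solrScores,
                              Map<Integer, Float> externalScores,
                              int n) {
        List<Integer> ids = new ArrayList<>(solrScores.keySet());
        ids.sort(Comparator.comparingDouble(
                (Integer id) -> solrScores.get(id) + externalScores.getOrDefault(id, 0f))
                .reversed());
        return new ArrayList<>(ids.subList(0, Math.min(n, ids.size())));
    }
}
```

With 10,000 ids in and n = 3500, everything past the cut is simply dropped.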

The problem is, the score for each doc is specific to the user making the
request.  The algorithm in doing these scores is quite complex.  I cannot
simply re-index with new scores, hence I've written this component which
runs after querycomponent and does the magic of filtering.  

I've come up with a solution but it involved me changing a lot of solr code. 
First and foremost, I've made the queryResultCache public and developed a
small API in accessing and changing it.  I've also changed the
QueryResultKey to include a Long userId in its hashCode and equals
functions.  When a search is made, the QueryComponent caches its results,
and then in my custom component I go into that cache, get my superset,
filter it out from the scores in my external server, and throw it back into
cache.  Of course none of this happens if my custom scored stuff is already
cached, so it's actually decent.  
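
To show the idea of folding the user into the cache key, here is a standalone sketch.  UserScopedKey is a hypothetical class, not Solr's QueryResultKey, and the plain query string stands in for everything the real key holds (query, sort, filters).

```java
import java.util.Objects;

final class UserScopedKey {
    private final String query;  // stand-in for query, sort and filters
    private final long userId;   // the user who made the request

    UserScopedKey(String query, long userId) {
        this.query = query;
        this.userId = userId;
    }

    // Two keys match only if both the query and the user match, so two
    // users running the same query never share a cached result.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof UserScopedKey)) return false;
        UserScopedKey other = (UserScopedKey) o;
        return userId == other.userId && query.equals(other.query);
    }

    @Override
    public int hashCode() {
        return Objects.hash(query, userId);
    }
}
```

The cost is that the cache now holds one entry per (query, user) pair, so the hit rate depends on how often the same user repeats a query.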

If you have any suggestions and improvements I'd greatly appreciate it. 
Sorry for the long response...I didn't want to be an XY problem again :D

--
View this message in context: 
http://lucene.472066.n3.nabble.com/After-the-query-component-has-the-results-can-I-do-more-filtering-on-them-tp3114775p3141652.html


Re: Custom Cache cleared after a commit?

2011-07-05 Thread arian487
Sorry for my ignorance, but do you have any lead in the code on where to look
for this?  Also, I'd still need a way of finding out how long it's been in
the cache because I don't want it to regenerate every time.  I'd want it to
regenerate only if it's been in the cache for more than 6 hours (or some time
frame which I deem to be good).  Thanks

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Custom-Cache-cleared-after-a-commit-tp3136345p3141673.html


Re: what s the optimum size of SOLR indexes

2011-07-05 Thread arian487
It depends on how many queries you'd be making per second.  I know for us, I
have a gradient of index sizes.  The first machine, which gets hit most
often is about 2.5 gigs.  Most of the queries would only ever need to hit
this index, but then I have bigger indices of about 5-10 gigs each which
are slower, but don't get queried as often so I can afford them to be a
little slower (and hence the bigger index)

--
View this message in context: 
http://lucene.472066.n3.nabble.com/what-s-the-optimum-size-of-SOLR-indexes-tp3137314p3142309.html


Re: Custom Cache cleared after a commit?

2011-07-04 Thread arian487
I guess I'll have to use something other than SolrCache to get what I want
then.  Or I could use SolrCache and just change the code (I've already done
so much of this anyway...).  Anyway, thanks for the reply.  

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Custom-Cache-cleared-after-a-commit-tp3136345p3136580.html


Custom Cache cleared after a commit?

2011-07-03 Thread arian487
I know the queryResultCache and the like live only until a commit happens
but I'm wondering if the custom caches are like this as well?  I'd actually
rather have a custom cache which is not cleared at all.  I want to give the
elements of this Cache a 6 hour TTL (or some time frame) but I never want it
to clear on a commit.  Is this possible using SolrCache?
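
For reference, this is roughly the behaviour I mean, sketched as a standalone map with a per-entry timestamp.  It is not a SolrCache implementation; the class name and the injectable clock are made up for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

class TtlCache<K, V> {
    // A cached value plus the time (in milliseconds) it was stored.
    private static final class Entry<T> {
        final T value;
        final long storedAt;
        Entry(T value, long storedAt) {
            this.value = value;
            this.storedAt = storedAt;
        }
    }

    private final Map<K, Entry<V>> map = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock;  // e.g. System::currentTimeMillis

    TtlCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    void put(K key, V value) {
        map.put(key, new Entry<>(value, clock.getAsLong()));
    }

    // Returns the value, or null if it is missing or has reached the TTL.
    // Expired entries are removed on access; nothing here reacts to commits.
    V get(K key) {
        Entry<V> e = map.get(key);
        if (e == null) {
            return null;
        }
        if (clock.getAsLong() - e.storedAt >= ttlMillis) {
            map.remove(key);
            return null;
        }
        return e.value;
    }
}
```

A 6 hour TTL would be new TtlCache<>(6 * 60 * 60 * 1000L, System::currentTimeMillis).  Entries that are never read again stay in memory, so a real version would also need a size bound or a periodic sweep.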

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Custom-Cache-cleared-after-a-commit-tp3136345p3136345.html


QueryResultCache question

2011-07-01 Thread arian487
So it seems the things in the queryResultCache have no TTL.  I'm just curious
how it works if I reindex something with new info.  I am going to be
reindexing things often (I'd sort by last login and this changes fast). 
I've been stepping through the code and of course if the same queries come
in it simply gets the results from the key in the result cache.  However, if
I make the same query over and over again, when will I ever get different
results?  

I'm a little confused as to how the 'correct' results are shown if it just
uses the QueryResultKey to get the results from the cache.  I imagine a new
Searcher with a fresh cache is created or something with every index?  If
I'm reindexing very often, how useful is the QueryResultCache?  

--
View this message in context: 
http://lucene.472066.n3.nabble.com/QueryResultCache-question-tp3130135p3130135.html


Re: QueryResultCache question

2011-07-01 Thread arian487
Thanks for the quick reply!  I see there's no way to access the result cache.
I actually want to access the result cache in a new component I have
which runs after the query but it seems this is impossible.  I guess I'm
just going to rebuild the code to make it public or something as I need the
result cache.  

--
View this message in context: 
http://lucene.472066.n3.nabble.com/QueryResultCache-question-tp3130135p3130603.html


Re: After the query component has the results, can I do more filtering on them?

2011-06-30 Thread arian487
Unfortunately the userIdsToScore updates very often.  I'd get more Ids almost
every single query (hence why I made the new component).  But I see the
problem of not being able to score the whole resultSet.  I'd actually need
to do this now that I think about it.  I want to get a whole whack of users
(let's say 10,000), score them using my system, and then 'remember' the top
3500 of these users in the result cache or something.  

How would I go about operating on the whole resultSet rather than just the
'rows' I set?  I wonder if I can set rows to be really large, score them in
the component, and then remember all of these results in the result cache
and then dynamically change rows in my component so not all 3500 (or w/e
number I choose) are returned.  

--
View this message in context: 
http://lucene.472066.n3.nabble.com/After-the-query-component-has-the-results-can-I-do-more-filtering-on-them-tp3114775p3127560.html


Re: After the query component has the results, can I do more filtering on them?

2011-06-30 Thread arian487
Sorry for the double post but in this case, is it possible for me to access
the queryResultCache in my component and play with it?  Ideally what I want
is this:

1) I have 10,000 (just a random large number) total results. 
2) In my component I access all of these results, score them, and take the
top 3500 (a random smaller number) and drop the rest.  
3) The 3500 I have now should end up going into the queryResultCache and
essentially replacing the other one.
4) The number returned to the user should then be rows and subsequent
queries which are the same just get them from my new result cache.

I'm pretty noob about all of this so I'm hoping someone can help.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/After-the-query-component-has-the-results-can-I-do-more-filtering-on-them-tp3114775p3127581.html


Re: After the query component has the results, can I do more filtering on them?

2011-06-29 Thread arian487
bump

--
View this message in context: 
http://lucene.472066.n3.nabble.com/After-the-query-component-has-the-results-can-I-do-more-filtering-on-them-tp3114775p3123502.html


After the query component has the results, can I do more filtering on them?

2011-06-27 Thread arian487
So I made a custom search component which runs right after the query
component and this custom component will update the score of each doc based on
some things (and no, I definitely can't use existing components).  I didn't
see any easy way to just update the score so what I currently do is
something like this:

DocList docList = rb.getResults().docList;
float[] scores = new float[docList.size()];
int[] docs = new int[docList.size()];
int docCounter = 0;
int maxScore = 0;

// Create the iterator once; calling docList.iterator() inside the loop
// would start a fresh iterator on every pass and never advance.
DocIterator it = docList.iterator();
while (it.hasNext()) {
    int userId = it.nextDoc();
    int score = userIdsToScore.get(userId);

    scores[docCounter] = score;
    docs[docCounter] = userId;
    docCounter++;

    // Track the highest score seen so far.
    if (maxScore < score) {
        maxScore = score;
    }
}
docList = new DocSlice(0, docCounter, docs, scores, 0, maxScore);

My userIdsToScore hashtable is how I'm determining the new score.  There are
a few other things I'm doing but this is the gist.  I'm also not sure how to
go about sorting this...but basically my question is, is this how I should
be updating the score of the documents?
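
On the sorting part, one option I can think of is to reorder the two parallel arrays by score before building the new slice.  A minimal sketch in plain Java, with no Solr classes involved and a made-up class name:

```java
import java.util.Arrays;
import java.util.Comparator;

class ScoreSort {
    // Reorders docs and scores in place so the highest score comes first,
    // keeping each doc id paired with its own score.
    static void sortByScoreDesc(int[] docs, float[] scores) {
        // Sort a list of positions instead of the arrays themselves.
        Integer[] order = new Integer[docs.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> scores[i]).reversed());

        // Copy the originals, then write them back in the sorted order.
        int[] oldDocs = docs.clone();
        float[] oldScores = scores.clone();
        for (int i = 0; i < order.length; i++) {
            docs[i] = oldDocs[order[i]];
            scores[i] = oldScores[order[i]];
        }
    }
}
```

After this the arrays could be handed to the DocSlice constructor as before.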

--
View this message in context: 
http://lucene.472066.n3.nabble.com/After-the-query-component-has-the-results-can-I-do-more-filtering-on-them-tp3114775p3114775.html


Re: Caching queries.

2011-06-20 Thread arian487
Thanks, this is exactly what I'm looking for!

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Caching-queries-tp3078271p3087497.html


Caching queries.

2011-06-17 Thread arian487
I'm wondering if something like this is possible.  Let's say I want to query
5000 objects all pertaining to a specific search and I want to return the
top 100 or something and cache the rest on my solr server.  The next time I
get the same query or something with a new offset (let's say start from 101)
does it have to do the query again or can it go to cache and get the next
100?  

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Caching-queries-tp3078271p3078271.html


Is there a way to get all the hits and score them later?

2011-06-02 Thread arian487
Basically I don't want the hits and the scores at the same time.  I want to
get a list of hits but I want to score them myself externally (there is a
dedicated server that will do the scoring given a list of id's).  Thanks!

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Is-there-a-way-to-get-all-the-hits-and-score-them-later-tp3016424p3016424.html


Re: Is there a way to get all the hits and score them later?

2011-06-02 Thread arian487
To clarify.  I want to do this all underneath solr.  I don't want to get a
bunch of hits from solr in my app and then go to my server and score them
again.  I'd like to score them myself underneath solr before I return the
results to my app.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Is-there-a-way-to-get-all-the-hits-and-score-them-later-tp3016424p3016592.html


Re: Is there a way to get all the hits and score them later?

2011-06-02 Thread arian487
Actually I was thinking I wanted to do something before the sharding (like in
the layer where faceting happens for example).  I wanna hack a plugin in the
middle to go to my server after I have a bunch of hits.  Just not sure where
to do this...

Though I've decided I can do scoring from solr (like a preliminary scoring
to narrow down some results) and then in the middle send those hits to my
server for additional scoring.  I can't hack it on in the end since the
sharding has happened I think, I'm just not sure where to look right now.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Is-there-a-way-to-get-all-the-hits-and-score-them-later-tp3016424p3017401.html


Re: Is there a way to get all the hits and score them later?

2011-06-02 Thread arian487
Hmm, looks like I can inherit the Similarity Class and do my own thing there. 

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Is-there-a-way-to-get-all-the-hits-and-score-them-later-tp3016424p3018001.html


Re: Custom Scoring relying on another server.

2011-05-31 Thread arian487
bump

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Custom-Scoring-relying-on-another-server-tp2994546p3006873.html


Custom Scoring relying on another server.

2011-05-27 Thread arian487
I know this question has been asked before but I think my situation is a
little different.  Basically I need to do custom scores that the traditional
function queries simply won't allow me to do.  I actually need to hit
another server from Java (passing in a bunch of things mostly relying on how
to score result).  So I want to extend the current scorer and add in the
things I need it to do for the scoring (make a trip to the scoring server
with a bunch of parameters, and come back with the scores).  

Can someone point me in the right direction for doing this?  Exactly where
does the document scoring happen in Solr?  Thanks!

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Custom-Scoring-relying-on-another-server-tp2994546p2994546.html


Re: Field collapsing on multiple fields and/or ranges?

2011-05-18 Thread arian487
bump

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Field-collapsing-on-multiple-fields-and-or-ranges-tp2929793p2958029.html


Re: Field collapsing on multiple fields and/or ranges?

2011-05-18 Thread arian487
Thanks for the reply!  How exactly do I open an issue?  

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Field-collapsing-on-multiple-fields-and-or-ranges-tp2929793p2958277.html


Re: Field collapsing on multiple fields and/or ranges?

2011-05-18 Thread arian487
https://issues.apache.org/jira/browse/SOLR-2526

modules/grouping was not a valid component so I just put it in search. 
Thanks!

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Field-collapsing-on-multiple-fields-and-or-ranges-tp2929793p2958408.html


Re: Field collapsing on multiple fields and/or ranges?

2011-05-18 Thread arian487
Ah, my mistake.  Thanks a lot, this would be a really cool feature :)

For now I'm resorting to making more than one query and cross
referencing the two separate queries.  

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Field-collapsing-on-multiple-fields-and-or-ranges-tp2929793p2959439.html


Field collapsing on multiple fields and/or ranges?

2011-05-11 Thread arian487
I'm wondering if there is a way to get the field collapsing to collapse on
multiple things?  For example, is there a way to get it to collapse on a
field (lets say 'domain') but ALSO something else (maybe time or something)?

To visualize maybe something like this:

Group1 has common field 'www.forum1.com' and ALSO the posts are all from May
11
Group2 has common field 'www.forum2.com' and ALSO the posts are all from May
11
.
.
.
GroupX has common field 'www.forum1.com' and ALSO the posts are from May 12

So obviously it's still sorted by date but it won't group the
'www.forum1.com' things together if the document is from a different date,
it'll group common date AND common domain field.  
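
To pin down the semantics I'm after, here is a plain-Java sketch that groups on a composite key built from the domain and the day.  It does not use Solr's grouping at all; the class name, the String[] layout and the "|" separator are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CompositeGrouping {
    // Each post is {domain, day, title}.  Posts land in the same group only
    // when BOTH the domain and the day match.  Groups keep the order in
    // which their first post was seen.
    static Map<String, List<String>> group(List<String[]> posts) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String[] post : posts) {
            String key = post[0] + "|" + post[1];
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(post[2]);
        }
        return groups;
    }
}
```

If the input is already sorted by date, the groups come out in date order, with 'www.forum1.com' appearing once per day rather than once overall.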

Thanks!

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Field-collapsing-on-multiple-fields-and-or-ranges-tp2929793p2929793.html


RE: SolrQuery API for adding group filter

2011-05-10 Thread arian487
I'm actually using php but I get what you're saying.  I think I understand
what I need to do.  Thanks a lot man!

--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrQuery-API-for-adding-group-filter-tp2921539p2923701.html


RE: SolrQuery API for adding group filter

2011-05-10 Thread arian487
I actually have another question unrelated to this (but related to grouping). 
I'm wondering if I can do a more complex grouping, such as grouping by a
field and also making sure it matches some other criteria (such as date). 
For example, currently it might group 5 items from some field, but the 5th
item for example is from a really far date which I don't want grouped with
these more recent items.  

Basically I want it to look like this:

Group1 items all have common field 'x' and are ALSO from today
Group2 items all have common field 'x' again but are from yesterday,
etc...

I'm having trouble figuring out how that'd work, any help would be
appreciated!

--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrQuery-API-for-adding-group-filter-tp2921539p2924232.html


SolrQuery API for adding group filter

2011-05-09 Thread arian487
There doesn't seem to be an API to add a group (like group.field or group=true). 
I'm very new to this so I'm wondering how I'd go about adding a group query
much like how I use 'addFilterQuery' to add an fq.  Thanks.  
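
What I'm trying to end up with is a request carrying group=true and group.field alongside the query.  A minimal sketch that builds that query string by hand in plain Java, without any Solr client library (the class name and the 'domain' field are made up):

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

class GroupParams {
    // Builds "q=...&group=true&group.field=..." with URL-encoded values.
    static String queryString(String q, String groupField) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", q);
        params.put("group", "true");
        params.put("group.field", groupField);
        return params.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    public static void main(String[] args) {
        System.out.println(queryString("solr", "domain"));
    }
}
```

As far as I can tell, SolrJ's SolrQuery also inherits a generic set(name, value) from ModifiableSolrParams that takes arbitrary parameter names, which may be enough when no dedicated method exists.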

--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrQuery-API-for-adding-group-filter-tp2921539p2921539.html