Re: help implementing a couple of business rules

2010-01-12 Thread Aleksander Stensby
For your first question, wouldn't it be possible to achieve that with some
simple boolean logic? I mean, if you have a requirement to match any of the
other fields AND description2, but not if it ONLY matches description 2:

say matching x against field A, B, and description 2:
((A:x OR B:x) AND description2:x)
would only give you results from description2 IF there is also a match in
either one of the other two fields.
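
A minimal sketch of how the clause above could be assembled for an arbitrary term, in Python. The helper names are made up for illustration. The second helper is a variant that is not in the message: it makes the other fields required and leaves description2 optional, so a document matching only A or B is still returned. That variant assumes the default query operator is OR.

```python
from urllib.parse import urlencode


def both_required(term, other_fields, restricted_field):
    """Clause from the message: restricted_field AND at least one other field."""
    others = " OR ".join(f"{field}:{term}" for field in other_fields)
    return f"(({others}) AND {restricted_field}:{term})"


def restricted_optional(term, other_fields, restricted_field):
    """Variant (assumption): one other field must match, and
    restricted_field only contributes to the score."""
    others = " OR ".join(f"{field}:{term}" for field in other_fields)
    return f"+({others}) {restricted_field}:{term}"


strict = both_required("x", ["A", "B"], "description2")
relaxed = restricted_optional("x", ["A", "B"], "description2")
encoded = urlencode({"q": relaxed})

print(strict)   # ((A:x OR B:x) AND description2:x)
print(relaxed)  # +(A:x OR B:x) description2:x
print(encoded)
```

Neither helper escapes special query characters, so this only holds for simple single-word terms.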

If I misunderstood your requirements, you should also note that Solr
supports pure negative field matching as well, meaning that you CAN exclude
results from a specific field entirely. From the wiki:

 Pure negative queries (all clauses prohibited) are allowed.
 -inStock:false finds all field values where inStock is not false


Hope that helps,
 Aleks


On Mon, Jan 11, 2010 at 7:29 PM, Joe Calderon calderon@gmail.com wrote:

 thx, but im not sure that covers all edge cases, to clarify
 1. matching description2 is okay if other fields are matched too, but
 results matching only to description2 should be omitted

 2. its okay to not match against the people field, but matches against
 the people field should only be phrase matches

 sorry if  i was unclear

 --joe
 On Mon, Jan 11, 2010 at 10:13 AM, Erik Hatcher erik.hatc...@gmail.com
 wrote:
 
  On Jan 11, 2010, at 12:56 PM, Joe Calderon wrote:
 
  1. given a set of fields how to return matches that match across them
  but not just one specific one, ex im using a dismax parser currently
  but i want to exclude any results that only match against a field
  called 'description2'
 
  One way could be to add an fq parameter to the request:
 
fq=-description2:(query)
 
  2. given a set of fields how to return matches that match across them
  but on one specific field match as a phrase only, ex im using a dismax
  parser currently but i want matches against a field called 'people' to
  only match as a phrase
 
  Doesn't setting pf=people accomplish this?
 
 Erik
 
 



Re: Yankee's Solr integration

2010-01-12 Thread Aleksander Stensby
They have probably added the logic for that server-side. Solr does not
support these types of features, but they are easy to implement.

Saving a search could be as easy as storing the selected query parameters.
Creating an alert (or RSS feed) for that would then be a process on the
server that executes those stored queries against Solr at regular intervals,
formats the results as either RSS or an email, and ships that off to the
client that subscribed.
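
A rough sketch of that idea in Python, under a few assumptions: the Solr URL, the field names `title` and `id`, and the helper names are all hypothetical, and the scheduling (cron or similar) is left out. A saved search is just a dict of query parameters; the server-side job re-runs it and renders the hits as a bare-bones RSS document.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen
from xml.sax.saxutils import escape

SOLR_SELECT_URL = "http://localhost:8983/solr/select"  # assumed endpoint


def build_url(saved_params):
    """Turn stored query parameters into a select URL asking for JSON."""
    return SOLR_SELECT_URL + "?" + urlencode(dict(saved_params, wt="json"))


def fetch_docs(saved_params):
    """Execute a saved search against Solr (needs a running server)."""
    with urlopen(build_url(saved_params)) as response:
        return json.load(response)["response"]["docs"]


def to_rss(title, docs):
    """Render documents as a minimal RSS 2.0 feed with one item per hit."""
    items = "".join(
        f"<item><title>{escape(str(doc.get('title', doc.get('id', ''))))}</title></item>"
        for doc in docs
    )
    return (
        f'<rss version="2.0"><channel><title>{escape(title)}</title>'
        f"{items}</channel></rss>"
    )


saved_search = {"q": "solr"}      # what "saving a search" would store
example_docs = [{"id": "1"}]      # made-up hit, standing in for fetch_docs()
feed = to_rss("My alert", example_docs)
print(build_url(saved_search))
print(feed)
```

The email variant would replace `to_rss` with a plain-text formatter and hand the result to a mailer.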

Cheers,
 Aleks


On Wed, Jan 6, 2010 at 3:12 PM, Nicolas Kern nico...@nicolaskern.fr wrote:

 Hello everybody,

 I was wondering how Yankee (
 http://www.yankeegroup.com/search.do?searchType=advancedSearch) managed to
 provide the possibility to create alerts, save searches, and generate an RSS
 feed out of a custom search using Solr. Do you have any idea?

 Thanks a lot,
 Best regards & happy new year!
 Nicolas



Re: Facets and distributed search

2010-01-05 Thread Aleksander Stensby
Hi Yonik!

I've tried recreating the problem now to get some log output and the problem
just doesn't seem to be there anymore... This puzzles me a bit, as the
problem WAS definitely there before.
I've done one change and that is to optimize the index on one of the
servers. But should that impact this to such a significant extent?
The other thing I noticed was that I had set facet.mincount=0, which is
obviously stupid in this case and might just be the problem here.
Changing it to mincount=1 made all queries fast again:)
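
For reference, a small sketch of how the distributed facet request could be assembled with facet.mincount=1, in Python. The function name is made up, and the limit is spelled `facet.limit` here (the documented facet parameter) although the messages in this thread write `limit=-1`; the shard addresses are the ones from the original question.

```python
from urllib.parse import urlencode


def distributed_facet_params(shards, mincount=1):
    """Query string for a facet count over several shards."""
    return urlencode([
        ("q", "threadid:33"),
        ("rows", 0),                      # only the facet counts are wanted
        ("facet", "true"),
        ("facet.field", "author"),
        ("facet.limit", -1),              # no cap on returned facet values
        ("facet.mincount", mincount),     # 1 skips authors with zero hits
        ("shards", ",".join(shards)),
    ])


query_string = distributed_facet_params(["s1:8983/solr", "s2:8983/solr"])
print(query_string)
```

With mincount=0 and no limit, every shard has to report every author value in its index, including those with zero matches, which is consistent with the slowdown described above.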

Sorry for the stupid inquiry, I'll be sure to check my tests two or three
times before posting similar issues again!

Cheers,
 Aleks


On Mon, Jan 4, 2010 at 5:26 PM, Yonik Seeley yo...@lucidimagination.com wrote:

 Something looks wrong... that type of slowdown is certainly not expected.
 You should be able to see both the main query and a sub-query in the
 logs... could you post an actual example?

 -Yonik
 http://www.lucidimagination.com


 On Mon, Jan 4, 2010 at 4:15 AM, Aleksander Stensby
 aleksander.sten...@integrasco.com wrote:
  Hi everyone! I've posted a similar question earlier, but in a thread
 related
  to facets in general, so I thought I'd repost it here as a separate
 thread.
 
  I have a faceted search that is very fast when I executed the query on a
  single solr server, but is significantly slower when executed in a
  distributed environment.
  The set-back seem to be in the sharding of our data.. And that puzzles me
 a
  little bit... I can't really see why SOLR is so slow at doing this.
 
  The scenario:
  Let's say we have two servers (s1 and s2).
  If i query
  the following:
 
  q=threadid:33&facet=true&facet.field=author&limit=-1&facet.mincount=0&rows=0
  directly on either server, the response is lightning fast. (10ms)
 
  So, in theory I could query them directly, concat the result myself and
 get
  that done pretty fast.
 
  But if I introduce the shards parameter, the response time booms to
 between
  15000ms and 2ms!
  shards=s1:8983/solr,s2:8983/solr
 
  My initial thoughts is that I MUST be doing something wrong here?
 
  So I try the following:
  Run the query on server s1, with the shards param shards=s1:8983/solr
  response time goes from sub 10ms to between 5000ms and 1ms!
  Same results if i run the query on s2, and same if i use
 shards=s2:8983/solr
 
  Is there really that much overhead in running a distributed facet field
  query with Solr? Anyone else experienced this?
 
  On the other hand, running regular queries without facet distributed is
  lightning fast... (so can't really see that this is a network problem or
  anything either). - I tried running a facet query on s1 with s1 as the
  shards param, and that is still as slow as if the shards param was
 pointed
  to a different server...
 
  Any insight into this would be greatly appreciated! (Would like to avoid
  having to hack together our own solution concatenating results...)
 
  Cheers,
   Aleks
 



Re: Optimize not having any effect on my index

2010-01-04 Thread Aleksander Stensby
Hey, I managed to run it correctly after a few restarts. Don't really know
what happened.
I can't really see what this would have to do with the compound file format,
though. But no, I'm not using the compound file format.

Cheers and thanks for your replies,
 Aleks

On Mon, Dec 21, 2009 at 8:27 AM, gurudev suyalprav...@yahoo.com wrote:


 Hi,

 Are you using the compound file format? If yes, have you set it properly
 in solrconfig.xml? If not, change to:

 <useCompoundFile>true</useCompoundFile> (this is by default 'false') under
 the tags:

 <indexDefaults>...</indexDefaults>
  and, <mainIndex>...</mainIndex>




 Aleksander Stensby wrote:
 
  Hey guys,
  I'm getting some strange behavior here, and I'm wondering if I'm doing
  anything wrong..
 
  I've got an unoptimized index, and I'm trying to run the following
  command:
 
  http://server:8983/solr/update?optimize=true&maxSegments=10&waitFlush=false
  Tried it first directly in the browser, it obviously took quite a bit of
  time, but once it was finished I see no difference in my index. Same
  number
  of files, same size etc.
  So i tried with curl:
  curl http://server:8983/solr/update --data-binary '<optimize/>' -H
  'Content-type:text/xml; charset=utf-8'
 
  No difference here either... Am I doing anything wrong? Do i need to
 issue
  a
  commit after the optimize?
 
  Any pointers would be greatly appreciated.
 
  Cheers,
   Aleks
 
 

 --
 View this message in context:
 http://old.nabble.com/Optimize-not-having-any-effect-on-my-index-tp26843094p26870653.html
 Sent from the Solr - User mailing list archive at Nabble.com.




Facets and distributed search

2010-01-04 Thread Aleksander Stensby
Hi everyone! I've posted a similar question earlier, but in a thread related
to facets in general, so I thought I'd repost it here as a separate thread.

I have a faceted search that is very fast when I execute the query on a
single Solr server, but is significantly slower when executed in a
distributed environment.
The setback seems to be in the sharding of our data, and that puzzles me a
little bit... I can't really see why Solr is so slow at doing this.

The scenario:
Let's say we have two servers (s1 and s2).
If i query
the following:
q=threadid:33&facet=true&facet.field=author&limit=-1&facet.mincount=0&rows=0
directly on either server, the response is lightning fast. (10ms)

So, in theory I could query them directly, concat the result myself and get
that done pretty fast.

But if I introduce the shards parameter, the response time booms to between
15000ms and 2ms!
shards=s1:8983/solr,s2:8983/solr

My initial thought is that I MUST be doing something wrong here?

So I try the following:
Run the query on server s1, with the shards param shards=s1:8983/solr
response time goes from sub 10ms to between 5000ms and 1ms!
Same results if i run the query on s2, and same if i use shards=s2:8983/solr

Is there really that much overhead in running a distributed facet field
query with Solr? Anyone else experienced this?

On the other hand, running regular distributed queries without facets is
lightning fast... (so I can't really see that this is a network problem or
anything either). I also tried running a facet query on s1 with s1 as the
shards param, and that is still as slow as if the shards param was pointed
to a different server...

Any insight into this would be greatly appreciated! (Would like to avoid
having to hack together our own solution concatenating results...)

Cheers,
 Aleks


Optimize not having any effect on my index

2009-12-18 Thread Aleksander Stensby
Hey guys,
I'm getting some strange behavior here, and I'm wondering if I'm doing
anything wrong..

I've got an unoptimized index, and I'm trying to run the following command:
http://server:8983/solr/update?optimize=true&maxSegments=10&waitFlush=false
Tried it first directly in the browser, it obviously took quite a bit of
time, but once it was finished I see no difference in my index. Same number
of files, same size etc.
So i tried with curl:
curl http://server:8983/solr/update --data-binary '<optimize/>' -H
'Content-type:text/xml; charset=utf-8'
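
For anyone scripting this rather than using curl, a minimal Python sketch of the same XML update message. The host name is the placeholder from the command above, the helper names are made up, and the actual send is wrapped in a function that is not called so the sketch runs without a server.

```python
from urllib.request import Request, urlopen

UPDATE_URL = "http://server:8983/solr/update"  # placeholder host from the message


def xml_update_request(xml_command):
    """Build a POST carrying an XML update message such as <optimize/>."""
    return Request(
        UPDATE_URL,
        data=xml_command.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


def send(request):
    """Send the request and return the HTTP status (needs a live Solr)."""
    with urlopen(request) as response:
        return response.status


optimize_request = xml_update_request("<optimize/>")
print(optimize_request.get_method(), optimize_request.full_url)
print(optimize_request.data)
```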

No difference here either... Am I doing anything wrong? Do i need to issue a
commit after the optimize?

Any pointers would be greatly appreciated.

Cheers,
 Aleks


Re: Can solr do the equivalent of select distinct(field)?

2009-12-17 Thread Aleksander Stensby
A follow up question on this Hoss:
If I have a set of documents, let's say this email thread. Each email has a
unique author. All emails in the thread are indexed with threadid=33. If I
want to count the number of unique authors in this email thread, I could go
along the lines you mention at the end:
rows=0&threadid=33&facet=true&facet.field=author&limit=-1
then count all returned facets. This works, but becomes infeasible when the
number of unique author values in the index is large. Right?
So the limit=-1 solution is just not working for such fields. But it would work
well for category if the number of unique categories is low...
It's almost faster to retrieve all entries from the thread and count
the number of unique authors programmatically... But obviously, I don't want
to do that!
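
To make "count all returned facets" concrete, a small Python sketch that counts the distinct values in a facet response. It assumes the JSON response writer with its default flat layout, where each facet field is a list alternating value and count; the sample response is made up.

```python
def count_distinct(solr_response, field):
    """Count facet values of `field` that matched at least one document."""
    flat = solr_response["facet_counts"]["facet_fields"][field]
    counts = flat[1::2]  # the list alternates value, count, value, count, ...
    return sum(1 for count in counts if count > 0)


# Made-up response fragment: three authors in the index, two in this thread.
sample = {
    "facet_counts": {
        "facet_fields": {"author": ["alice", 3, "bob", 1, "carol", 0]}
    }
}

unique_authors = count_distinct(sample, "author")
print(unique_authors)  # 2
```

The `count > 0` filter is what facet.mincount=1 would do on the server side, with the difference that the zero-count values would then never be sent at all.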

So, how would you go about to find the number of unique authors in this
scenario?

Cheers,
 Aleks

On Wed, Sep 2, 2009 at 12:57 AM, Chris Hostetter
hossman_luc...@fucit.org wrote:


 : lets say you filter your query on something and want to know how many
 : distinct categories that your results comprise.
 : then you can facet on the category field and count the number of facet
 : values that are returned, right?

 if you count the number of facet values returned you are getting a count
 of distinct values

 if you just want the list of distinct values in a field (for your whole
 index) the TermsComponent is the fastest way.

 if you want the list of distinct values across a set of documents, then
 facet on that field when doing your query.

 select distinct category from books where bookInStock='true' is analogous
 to looking at the facet section of...

   rows=0&q=bookInStock:true&facet=true&facet.field=category


 -Hoss




Re: Can solr do the equivalent of select distinct(field)?

2009-12-17 Thread Aleksander Stensby
Forgot to add facet.mincount=1, obviously. But still, is this the only or
preferred way of doing something along these lines? Or is there a different
(better) approach?

Best regards,
 Aleksander

On Thu, Dec 17, 2009 at 5:59 PM, Aleksander Stensby 
aleksander.sten...@integrasco.com wrote:

 A follow up question on this Hoss:
 If I have a set of documents, let's say this email thread. Each email has a
 unique author. All emails in the thread are indexed with threadid=33 If I
 want to count the number of unique authors in this email thread, I could go
 along the lines you mention at the end:
 rows=0&threadid=33&facet=true&facet.field=author&limit=-1
 then count all returned facets. This works, but becomes infeasible when the
 number of unique author values in the index is large. Right?
 So the limit=-1 solution is just not working for such fields. But would
 work well for category if the number of unique categories is low...
 It's almost faster to retrieve all entries from the thread and count
 programmatically the number of unique authors... But obviously, I don't want
 to do that!

 So, how would you go about to find the number of unique authors in this
 scenario?

 Cheers,
  Aleks


 On Wed, Sep 2, 2009 at 12:57 AM, Chris Hostetter hossman_luc...@fucit.org
  wrote:


 : lets say you filter your query on something and want to know how many
 : distinct categories that your results comprise.
 : then you can facet on the category field and count the number of facet
 : values that are returned, right?

 if you count the number of facet values returned you are getting a count
 of distinct values

 if you just want the list of distinct values in a field (for your whole
 index) the TermsComponent is the fastest way.

 if you want the list of distinct values across a set of documents, then
 facet on that field when doing your query.

 select distinct category from books where bookInStock='true' is analogous
 to looking at the facet section of...

   rows=0&q=bookInStock:true&facet=true&facet.field=category


 -Hoss





Re: Can solr do the equivalent of select distinct(field)?

2009-12-17 Thread Aleksander Stensby
Thanks for your reply Erik!

The speed of my suggested query is actually very fast once we add the
facet.mincount=1 (when searching within a limited set of documents).
The setback seems to be in the sharding of our data, and that puzzles me a
little bit...

I can't really see why Solr is so slow at doing this.
The scenario:

Let's say we have two servers (s1 and s2).
If i query
the following:
q=threadid:33&facet=true&facet.field=author&limit=-1&facet.mincount=0&rows=0
directly on either server, the response is lightning fast. (10ms)
So, in theory I could query them directly, concat the result myself and get
that done pretty fast.
But if I introduce the shards parameter, the response time booms to between
15000ms and 2ms!
shards=s1:8983/solr,s2:8983/solr
My initial thought is that I MUST be doing something wrong here?

So I try the following:
Run the query on server s1, with the shards param shards=s1:8983/solr
response time goes from sub 10ms to between 5000ms and 1ms!
Same results if i run the query on s2, and same if i use shards=s2:8983/solr

Is there really that much overhead in running a distributed facet field
query with Solr? Anyone else experienced this?

On the other hand, running regular distributed queries without facets is
lightning fast... (so I can't really see that this is a network problem or
anything either). And it can't possibly be, as I tried running a facet query
on s1 with s1 as the shards param, and that is still as slow as if the
shards param was pointed to a different server...

Any insight into this would be greatly appreciated! (Would like to avoid
having to hack together our own solution concatenating results...)

Cheers,
 Aleks


On Thu, Dec 17, 2009 at 7:36 PM, Erik Hatcher erik.hatc...@gmail.com wrote:


 On Dec 17, 2009, at 11:59 AM, Aleksander Stensby wrote:

 A follow up question on this Hoss:
 If I have a set of documents, let's say this email thread. Each email has a
 unique author. All emails in the thread are indexed with threadid=33. If I
 want to count the number of unique authors in this email thread, I could go
 along the lines you mention at the end:
 rows=0&threadid=33&facet=true&facet.field=author&limit=-1
 then count all returned facets. This works, but becomes infeasible when the
 number of unique author values in the index is large. Right?
 So the limit=-1 solution is just not working for such fields. But would work
 well for category if the number of unique categories is low...
 It's almost faster to retrieve all entries from the thread and count
 programmatically the number of unique authors... But obviously, I don't want
 to do that!

 So, how would you go about to find the number of unique authors in this
 scenario?


 One possible solution is tree faceting:
 https://issues.apache.org/jira/browse/SOLR-792

facet.tree=threadid,author

 Could be a LARGE amount of data though!

Erik




Sorting on primitive types

2009-09-21 Thread Aleksander Stensby
Hey,
I have a question regarding the primitive type definitions and use of those
for sorting.

I have an ID field in my index of type SortableLongField, and on my test
index I have about 2 million documents. When doing a sort=id desc and q=*:*
I'm getting out of memory (heap space)...
I'm running the instance with 2GB of memory, so I wouldn't really think that
there should be any big problems here.

So I'm wondering if the Trie based field types are less memory expensive
than the old SortableXXFields?
Sorting on the date field (which is a TrieDateField) works fine (and
fast)...

Any input is highly appreciated!

Cheers,
 Aleksander


-- 
Aleksander M. Stensby
Lead Software Developer and System Architect
Integrasco A/S
www.integrasco.com
http://twitter.com/Integrasco
http://facebook.com/Integrasco

Please consider the environment before printing all or any of this e-mail


Re: Sorting on primitive types

2009-09-21 Thread Aleksander Stensby
Perfect, thanks a heap Yonik!

Cheers,
 Aleks

On Mon, Sep 21, 2009 at 3:47 PM, Yonik Seeley yo...@lucidimagination.com wrote:

 On Mon, Sep 21, 2009 at 3:30 AM, Aleksander Stensby
 aleksander.sten...@integrasco.com wrote:
  So I'm wondering if the Trie based field types are less memory expensive
  than the old SortableXXFields?
  sorting on the date field (which is a TrieDateField) works fine (and
  fast)...

 In general, yes (assuming there are many unique values - your ID field
 would qualify).  SortableXXFields used the StringIndex (the only
 option in the past)... the FieldCache entry for Trie* fields uses
 long[maxDoc] for TrieLong and TrieDate.

 -Yonik
 http://www.lucidimagination.com
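
To put rough numbers on that difference for the 2 million document index from the question, a back-of-the-envelope sketch in Python. The long[maxDoc] figure follows directly from 8 bytes per entry. The StringIndex figure is only illustrative: it counts an int per document plus one String per unique value, and the per-String size is an assumed round number, not a measured one.

```python
MAX_DOC = 2_000_000            # documents in the test index (from the question)
BYTES_PER_LONG = 8
BYTES_PER_INT = 4
ASSUMED_BYTES_PER_STRING = 56  # assumption: object header + chars for one term


def trie_long_cache_bytes(max_doc):
    """FieldCache entry for a TrieLong/TrieDate field: one long per document."""
    return max_doc * BYTES_PER_LONG


def string_index_cache_bytes(max_doc, unique_terms, bytes_per_string):
    """StringIndex: an int ordinal per document plus a String per unique term."""
    return max_doc * BYTES_PER_INT + unique_terms * bytes_per_string


trie_bytes = trie_long_cache_bytes(MAX_DOC)
string_bytes = string_index_cache_bytes(MAX_DOC, MAX_DOC, ASSUMED_BYTES_PER_STRING)
print(f"long[maxDoc]: {trie_bytes / 1e6:.0f} MB")
print(f"StringIndex (estimate): {string_bytes / 1e6:.0f} MB")
```

For an ID field every value is unique, so the number of Strings equals the number of documents, which is the case where the gap is largest.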






Re: Trie Date question

2009-08-28 Thread Aleksander Stensby
Thanks for the reply Yonik!
I'm using the nightly from 2009-08-20, so it's a rather fresh build. And by
comparing its schema with the one I'm using now, I found I had made a mistake
when defining the field.
By examining the most recent build, I noticed that the normal date field is
defined as follows:
<fieldType name="date" class="solr.TrieDateField" omitNorms="true"
precisionStep="0" positionIncrementGap="0"/>
(it's actually a TrieDateField? Does this mean that we are moving away from
the standard SolrDateField?)
and that the tdate is specified as follows:
<fieldType name="tdate" class="solr.TrieDateField" omitNorms="true"
precisionStep="6" positionIncrementGap="0"/>
I'll update my schema definitions and reindex :) Guess that pretty much will
solve my problems.
Thanks!
 Aleks

On Thu, Aug 27, 2009 at 3:47 PM, Yonik Seeley yo...@lucidimagination.com wrote:

 I can't reproduce any problem.

 Are you using a recent nightly build?
 See the example schema of a recent nightly build for the correct way
 to define a Trie based field - the article / blog may be out of date.

 Here's what I used to test the example data:

 http://localhost:8983/solr/select?q=manufacturedate_dt:[NOW/DAY-4YEAR%20TO%20NOW/DAY]

 -Yonik
 http://www.lucidimagination.com



 On Thu, Aug 27, 2009 at 3:49 AM, Aleksander Stensby
 aleksander.sten...@integrasco.com wrote:
  Hello everyone,
  after reading Grant's article about TrieRange capabilities on the lucid
 blog
  I did some experimenting, but I have some trouble with the tdate type and
 I
  was hoping that you guys could point me in the right direction.
  So, basically I index a regular solr date field and use that for sorting
 and
  range queries today. For experimenting I added tdate field, indexing it
 with
  the same data as in my other date field, but I'm obviously doing
 something
  wrong here, because the results coming back are completely different...
  the definitions in my schema:
  <field name="datetime" type="date" indexed="true" stored="false"
  omitNorms="true"/>
  <field name="tdatetime" type="tdate" indexed="true" stored="false"/>
 
  so if I do a query on my test index:
  q=datetime:[NOW/DAY-1YEAR TO NOW/DAY]
  i get numFound=1031524 (don't worry about the ordering yet)..
  then, if I do the following on my trie date field:
  q=tdatetime:[NOW/DAY-1YEAR TO NOW/DAY]
  i get numFound=0
  Where did I go wrong? (And yes, both fields are indexed with the exactly
  same data...)
  Thanks for any guidance here!
  Cheers,
   Aleks
 
 






Re: Trie Date question

2009-08-28 Thread Aleksander Stensby
Hmm, seems I was one day too early with my nightly then :p
Quote from Chris (2009-08-20 17:04):
i changed it to be manufacturedate_dt since that fits with the existing
scheme ... the data is all made up, but so is all the rest of our data.

It seems like lucene.apache.org is down at the moment, but I will try out the
new example data once it's back up again, because even though I changed my
schema definitions, the two fields still give back different results... :(
I'll keep you updated.
- Aleks
On Fri, Aug 28, 2009 at 9:33 AM, Aleksander Stensby 
aleksander.sten...@integrasco.com wrote:

 Thanks for the reply Yonik!
 I'm using the nightly from 2009-08-20, so it's a rather fresh build. And by
 comparing its schema with the one I'm using now, I found I had made a mistake
 when defining the field.
 By examining the most recent build, I noticed that the normal date field is
 defined as follows:
 <fieldType name="date" class="solr.TrieDateField" omitNorms="true"
 precisionStep="0" positionIncrementGap="0"/>
 (it's actually a TrieDateField? Does this mean that we are moving away from
 the standard SolrDateField?)
 and that the tdate is specified as follows:
 <fieldType name="tdate" class="solr.TrieDateField" omitNorms="true"
 precisionStep="6" positionIncrementGap="0"/>
 I'll update my schema definitions and reindex :) Guess that pretty much will
 solve my problems.
 Thanks!
  Aleks

 On Thu, Aug 27, 2009 at 3:47 PM, Yonik Seeley
 yo...@lucidimagination.com wrote:

 I can't reproduce any problem.

 Are you using a recent nightly build?
 See the example schema of a recent nightly build for the correct way
 to define a Trie based field - the article / blog may be out of date.

 Here's what I used to test the example data:

 http://localhost:8983/solr/select?q=manufacturedate_dt:[NOW/DAY-4YEAR%20TO%20NOW/DAY]

 -Yonik
 http://www.lucidimagination.com



 On Thu, Aug 27, 2009 at 3:49 AM, Aleksander Stensby
 aleksander.sten...@integrasco.com wrote:
  Hello everyone,
  after reading Grant's article about TrieRange capabilities on the lucid
 blog
  I did some experimenting, but I have some trouble with the tdate type
 and I
  was hoping that you guys could point me in the right direction.
  So, basically I index a regular solr date field and use that for sorting
 and
  range queries today. For experimenting I added tdate field, indexing it
 with
  the same data as in my other date field, but I'm obviously doing
 something
  wrong here, because the results coming back are completely different...
  the definitions in my schema:
  <field name="datetime" type="date" indexed="true" stored="false"
  omitNorms="true"/>
  <field name="tdatetime" type="tdate" indexed="true" stored="false"/>
 
  so if I do a query on my test index:
  q=datetime:[NOW/DAY-1YEAR TO NOW/DAY]
  i get numFound=1031524 (don't worry about the ordering yet)..
  then, if I do the following on my trie date field:
  q=tdatetime:[NOW/DAY-1YEAR TO NOW/DAY]
  i get numFound=0
  Where did I go wrong? (And yes, both fields are indexed with the exactly
  same data...)
  Thanks for any guidance here!
  Cheers,
   Aleks
 
 










Re: Can solr do the equivalent of select distinct(field)?

2009-08-28 Thread Aleksander Stensby
But you could use facets to do something similar to a "select distinct"...
Let's say you filter your query on something and want to know how many
distinct categories your results comprise.
Then you can facet on the category field and count the number of facet
values that are returned, right?
But maybe that's not what you are after...
cheers,
 Aleks

On Fri, Aug 28, 2009 at 11:22 AM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:

 On Fri, Aug 28, 2009 at 5:05 AM, Paul Tomblin ptomb...@xcski.com wrote:

  Can I get all the distinct values from the Solr database, or do I
  have to select everything and aggregate it myself?
 
 
 No, Solr has no way to do a distinct at query-time.

 --
 Regards,
 Shalin Shekhar Mangar.






Trie Date question

2009-08-27 Thread Aleksander Stensby
Hello everyone,
after reading Grant's article about TrieRange capabilities on the lucid blog
I did some experimenting, but I have some trouble with the tdate type and I
was hoping that you guys could point me in the right direction.
So, basically I index a regular solr date field and use that for sorting and
range queries today. For experimenting I added tdate field, indexing it with
the same data as in my other date field, but I'm obviously doing something
wrong here, because the results coming back are completely different...
the definitions in my schema:
<field name="datetime" type="date" indexed="true" stored="false"
omitNorms="true"/>
<field name="tdatetime" type="tdate" indexed="true" stored="false"/>

so if I do a query on my test index:
q=datetime:[NOW/DAY-1YEAR TO NOW/DAY]
i get numFound=1031524 (don't worry about the ordering yet)..
then, if I do the following on my trie date field:
q=tdatetime:[NOW/DAY-1YEAR TO NOW/DAY]
i get numFound=0
Where did I go wrong? (And yes, both fields are indexed with exactly the
same data...)
Thanks for any guidance here!
Cheers,
 Aleks
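
For comparing the two fields side by side, a small Python sketch that builds the same date-range query against each of them. The base URL and the function name are assumptions; rows=0 is added because only numFound matters for the comparison.

```python
from urllib.parse import urlencode

SOLR_SELECT_URL = "http://localhost:8983/solr/select"  # assumed endpoint


def range_query_url(field):
    """Select URL for 'last year up to today' on the given date field."""
    query = f"{field}:[NOW/DAY-1YEAR TO NOW/DAY]"
    return SOLR_SELECT_URL + "?" + urlencode({"q": query, "rows": 0})


date_url = range_query_url("datetime")
tdate_url = range_query_url("tdatetime")
print(date_url)
print(tdate_url)
```

Fetching both URLs and comparing numFound gives a quick check that the two field definitions index the same data the same way.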

