Re: Filter to cut out all zeros?

2010-03-10 Thread Norberto Meijome
won't this replace *all* 0s ? ie, 1024 will become 124 ?
_
{Beto|Norberto|Numard} Meijome

The only people that never change are the stupid and the dead
 Jorge Luis Borges.

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


On 11 March 2010 03:24, Sebastian F qba...@yahoo.com wrote:

 yes, thank you. That was exactly what I was looking for! Great help!




 
 From: Ahmet Arslan iori...@yahoo.com
 To: solr-user@lucene.apache.org
 Sent: Tue, March 9, 2010 7:26:46 PM
 Subject: Re: Filter to cut out all zeros?

  I'm trying to figure out the best way to cut out all zeros
  of an input string like 01.10. or 022.300...
  Is there such a filter in Solr or anything similar that I
  can adapt to do the task?

 With solr.MappingCharFilterFactory[1] you can replace all zeros with "" (the
 empty string) before the tokenizer.

 <charFilter class="solr.MappingCharFilterFactory" mapping="mapping.txt"/>

 SolrHome/conf/mapping.txt file will contain this line:

 "0" => ""

 So that 01.10. will become 1.1. and 022.300 will become 22.3. Is that what
 you want?

 [1]
 http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.MappingCharFilterFactory
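
 For reference, a sketch of where the charFilter sits in a field type (the
 type name and tokenizer here are placeholders, not from this thread) - it
 rewrites the raw character stream before the tokenizer ever sees it:

 <fieldType name="text_nozeros" class="solr.TextField">
   <analyzer>
     <!-- runs over the raw input, before tokenizing -->
     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping.txt"/>
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
   </analyzer>
 </fieldType>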






Re: weird problem with letters S and T

2009-10-28 Thread Norberto Meijome
On Wed, 28 Oct 2009 19:20:37 -0400
Joel Nylund jnyl...@yahoo.com wrote:

 Well I tried removing those 2 letters from stopwords, didn't seem to
 help. I also tried changing the field type to text_ws, didn't seem to
 work. Any other ideas?


Hi Joel,
if your stop word filter was applied at index time, you will have to reindex
(at least those documents with S and T).

If your stop filter was applied *only* at query time, then it should work after
you reload your app.
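
For reference, a minimal sketch (type name and tokenizer are placeholders)
showing that index-time and query-time stop filtering live in separate
analyzer blocks of the fieldType - it's the index-side one that forces a
reindex:

<fieldType name="text_stop" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- applied when documents are indexed; changing this requires a reindex -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- applied only to incoming queries; takes effect once you reload -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  </analyzer>
</fieldType>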

b

_
{Beto|Norberto|Numard} Meijome

Those who do not remember the past are condemned to repeat it.
   George Santayana

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: 99.9% uptime requirement

2009-08-04 Thread Norberto Meijome
On Mon, 3 Aug 2009 13:15:44 -0700
Robert Petersen rober...@buy.com wrote:

 Thanks all, I figured there would be more talk about daemontools if there
 were really a need.  I appreciate the input and for starters we'll put two
 slaves behind a load balancer and grow it from there.
 

Robert,
not taking away from daemon tools, but daemon tools won't help you if your
whole server goes down.

don't put all your eggs in one basket - use several
servers behind a load balancer (hardware load balancers x 2, haproxy, etc.)

and sure, use daemon tools to keep your services running within each server...

B
_
{Beto|Norberto|Numard} Meijome

Why do you sit there looking like an envelope without any address on it?
  Mark Twain

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: Updating Solr index from XML files

2009-07-08 Thread Norberto Meijome
On Tue, 7 Jul 2009 22:16:04 -0700
Francis Yakin fya...@liquid.com wrote:

 
 I have the following curl cmd to update and doing commit to Solr ( I have
 10 xml files just for testing)

[...]

hello,
DIH supports XML, right?

not sure if it works with n files... but it's worth looking at.
alternatively, you can write a relatively simple Java app that will pick each
file up and post it for you using SolrJ
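
For what it's worth, a rough data-config.xml sketch for walking a directory of
Solr-format XML files - the path is a placeholder, and the processor names are
worth double-checking against the DIH wiki for your version:

<dataConfig>
  <dataSource type="FileDataSource"/>
  <document>
    <!-- outer entity lists the files; rootEntity="false" so it emits no docs itself -->
    <entity name="f" processor="FileListEntityProcessor"
            baseDir="/path/to/xmlfiles" fileName=".*\.xml" rootEntity="false">
      <!-- inner entity reads each file as standard <add><doc> Solr XML -->
      <entity name="docs" processor="XPathEntityProcessor"
              url="${f.fileAbsolutePath}" useSolrAddSchema="true" stream="true"/>
    </entity>
  </document>
</dataConfig>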
b

_
{Beto|Norberto|Numard} Meijome

Mix a little foolishness with your serious plans;
it's lovely to be silly at the right moment.
   Horace

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Is there any other way to load the index beside using http connection?

2009-07-06 Thread Norberto Meijome
On Sun, 5 Jul 2009 21:36:35 +0200
Marcus Herou marcus.he...@tailsweep.com wrote:

 Sharing some of our exports from DB to solr. Note: many of the statements
 below might not work due to clip-clip.

thx Marcus - but that's a DIH config right? :)
b
_
{Beto|Norberto|Numard} Meijome

I respect faith, but doubt is what gives you an education.
   Wilson Mizner

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Is there any other way to load the index beside using http connection?

2009-07-06 Thread Norberto Meijome
On Sun, 5 Jul 2009 10:28:16 -0700
Francis Yakin fya...@liquid.com wrote:

[...] 
 upload the file to your SOLR server? Then the data file is local to your SOLR
 server , you will bypass any WAN and firewall you may be having. (or some
 variation of it, sql - SOLR server as file, etc..)
 
 How we upload the file? Do we need to convert the data file to Lucene Index
 first? And Documentation how we do this?

pick your poison... rsync? ftp? scp ? 

B
_
{Beto|Norberto|Numard} Meijome

The freethinking of one age is the common sense of the next.
   Matthew Arnold

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Is there any other way to load the index beside using http connection?

2009-07-06 Thread Norberto Meijome
On Mon, 6 Jul 2009 09:56:03 -0700
Francis Yakin fya...@liquid.com wrote:

  Norberto,
 
 Thanks, I think my questions is:
 
 why not generate your SQL output directly into your oracle server as a file
 
 What type of file is this?
 
 

a file in a format that you can then import into SOLR. 

_
{Beto|Norberto|Numard} Meijome

Gravity cannot be blamed for people falling in love.
  Albert Einstein

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Is there any other way to load the index beside using http connection?

2009-07-05 Thread Norberto Meijome
On Thu, 2 Jul 2009 11:28:51 -0700
Francis Yakin fya...@liquid.com wrote:

  Norberto,
 


Hi Francis,
Please reply to the list, or keep it in CC.

 You saying:
 
 Other alternatives are to transform the XML into csv and import it that way
 
 How do you transfer that CSV file to Solr?
 

http://wiki.apache.org/solr/UpdateCSV 
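
For example, that wiki page shows posting a local CSV file with curl along
these lines (the file name is a placeholder):

curl http://localhost:8983/solr/update/csv --data-binary @data.csv -H 'Content-type:text/plain; charset=utf-8'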

There actually is a LOT of information in the wiki, as well as the mailing list 
archives.

good luck,
B
_
{Beto|Norberto|Numard} Meijome

The freethinking of one age is the common sense of the next.
   Matthew Arnold

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Is there any other way to load the index beside using http connection?

2009-07-05 Thread Norberto Meijome
On Thu, 2 Jul 2009 11:02:28 -0700
Francis Yakin fya...@liquid.com wrote:

 Norberto, Thanks for your input.
 
 What do you mean with Have you tried connecting to  SOLR over HTTP from
 localhost, therefore avoiding any firewall issues and network latency ? it
 should work a LOT faster than from a remote site. ?

 
 Here are how our servers lay out:
 
 1) Database ( Oracle ) is running on separate machine
 2) Solr master is running on separate machine by itself
 3) 6 solr slaves ( these 6 pull the index from master using rsync)
 
 We have a SQL (Oracle) script to post the data/index from the Oracle database
 machine to the Solr master over HTTP. We wrote those scripts (someone on the
 Oracle database administration team wrote them).

You said in your other email you are having issues with slow transfers between
1) and 2). Your subject relates to the data transfer between 1) and 2); 2) and
3) is irrelevant to this part.

My question (what you quoted above) relates to the point you made about it
being slow (WHY is it slow?), and issues with opening so many connections
through the firewall. So, I'll rephrase my question (see below...)

[]
 
 We can not do localhost since Solr is not running on the Oracle machine.

why not generate your SQL output directly on your Oracle server as a file, then
upload the file to your SOLR server? Then the data file is local to your SOLR
server, and you will bypass any WAN and firewall issues you may be having (or
some variation of it: SQL -> SOLR server as a file, etc.)

Any speed issues that are rooted in the fact that you are posting via
HTTP (vs embedded solr or DIH) aren't going to go away. But it's the simpler
approach without changing too much of your current setup.

 
 Another alternative that we are thinking of is to transform the XML into CSV
 and import/export it.

 How about LuSql, which someone mentioned? Is this app free (open source)?
 Do you have any experience with it?

Not i, sorry.

Have you looked into DIH? It's designed for this kind of work.

B
_
{Beto|Norberto|Numard} Meijome

Great spirits have often encountered violent opposition from mediocre minds.
  Albert Einstein

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: Is there any other way to load the index beside using http connection?

2009-07-02 Thread Norberto Meijome
On Wed, 1 Jul 2009 15:07:12 -0700
Francis Yakin fya...@liquid.com wrote:

 
 We have several thousands of XML files in a database that we load to the solr
 master. The database uses an HTTP connection to transfer those files to the
 solr master. Solr then translates the XML files into its Lucene index.

 We are experiencing issues with close/open connections in the firewall, and it
 is very, very slow.

 Is there any other way to load the data/index from the database to the solr
 master besides using an HTTP connection - i.e., can we just scp/ftp the XML
 files from the database system to the solr master and let solr convert those
 to Lucene indexes?
 

Francis,
after reading the whole thread, it seems you have :
  - Data source : Oracle DB, on separate location to your SOLR.
  - Data format : XML output.
  
definitely DIH is a great option, but since you are on 1.2, it's not available
to you (you should look into upgrading if you can!).

Have you tried connecting to SOLR over HTTP from localhost, thereby avoiding
any firewall issues and network latency? it should work a LOT faster than from
a remote site. Also make sure not to commit until you really need to.

Other alternatives are to transform the XML into csv and import it that way. Or 
write a simple app that will parse the xml and post it directly using the 
embedded solr method.

plenty of options, all of them documented @ solr's site.

good luck,
b 
_
{Beto|Norberto|Numard} Meijome

People demand freedom of speech to make up for the freedom of thought which 
they avoid.  
  Soren Aabye Kierkegaard

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Is it problem? I use solr to search and index is made by lucene. (not EmbeddedSolrServer(wiki is old))

2009-07-02 Thread Norberto Meijome
On Thu, 2 Jul 2009 16:12:58 +0800
James liu liuping.ja...@gmail.com wrote:

 I use solr to search and index is made by lucene. (not
 EmbeddedSolrServer(wiki is old))
 
 Is it problem when i use solr to search?
 
 which the difference between Index(made by lucene and solr)?

Hi James,
make sure the version of Lucene used to create your index is the same as the
libraries included in your version of SOLR. it should work.

it may be that an older lucene index works with the newer lucene libs provided
in solr, but after using it you may not be able to go back; i am not sure of
the details.

probably an FAQ by now - check the archives  :)

good luck,
B
_
{Beto|Norberto|Numard} Meijome

He has no enemies, but is intensely disliked by his friends.
  Oscar Wilde

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: Solr document security

2009-06-25 Thread Norberto Meijome
On Wed, 24 Jun 2009 23:20:26 -0700 (PDT)
pof melbournebeerba...@gmail.com wrote:

 
 Hi, I am wanting to add document-level security that works as follows: An
 external process makes a query to the index; depending on their security
 allowances, based on a login id, a list of hits is returned, minus any the
 user isn't meant to know even exist. I was thinking maybe a custom filter with
 a JDBC connection to check the security of the user vs. the document. I'm not
 sure how I would add the filter, how to write the filter, or how to get the
 login id from a GET parameter. Any suggestions, comments etc.?

Hi Brett,
(keeping in mind that i've been away from SOLR for 8 months, but i
don't think this was added of late)

standard approach is to manage security @ your
application layer, not @ SOLR. ie, search, return documents (which should
contain some kind of data to identify their ACL ) and then you can decide
whether to show it or not. 
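
One common variant (an illustration, not something SOLR gives you out of the
box): if each document is indexed with, say, an acl field listing the roles
allowed to see it, the application can silently append a filter it controls -
e.g. fq=acl:(admin OR sales) - so hits outside the user's roles are never
returned at all.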

HIH
_
{Beto|Norberto|Numard} Meijome

They never open their mouths without subtracting from the sum of human
knowledge. Thomas Brackett Reed

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: How can i indexing MS-Outlook files?

2008-12-23 Thread Norberto Meijome
On Sun, 14 Dec 2008 19:22:00 -0800 (PST)
Otis Gospodnetic otis_gospodne...@yahoo.com wrote:

 Perhaps an easier alternative is to index not the MS-Outlook files
 themselves, but email messages pulled from the IMAP or POP servers, if that's
 where the original emails live.

PST files ('outlook files') are local to the end user and quite possibly their
contents aren't available in the server anymore.

Another alternative could be to access, from Exchange's
file system itself, the files that represent each object... I don't know
whether this is still possible in Exchange 2007, or whether it is 'sanctioned'
by MS... Possibly some kind of object interface with exchange itself would be
most desirable


_
{Beto|Norberto|Numard} Meijome

FAST, CHEAP, SECURE: Pick Any TWO

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: Using Solr for indexing emails

2008-11-25 Thread Norberto Meijome
On Tue, 25 Nov 2008 03:59:31 +0200
Timo Sirainen [EMAIL PROTECTED] wrote:

  would it be faster to say q=user:user AND highestuid:[ * TO *]  ?  
 
 Now that I read again what fq really did, yes, sounds like you're right.

you may want to compare them both to see which one is better... I just went
from memory :P

  ( and i
  guess you'd sort DESC and return 1 record only).  
 
 No, I'd use the above for getting highestuid value for all mailboxes
 (there should be only one record per mailbox (each mailbox has separate
 uid values - separate highestuid value)) so I can look at the returned
 highestuid values to see what mailboxes aren't fully indexed yet.

gotcha. It is an interesting use of SOLR, i must say... I for one am not used
to having to deal with up-to-the-second update needs.

good luck,
B

_
{Beto|Norberto|Numard} Meijome

Never offend people with style when you can offend them with substance.
  Sam Brown

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: port of Nutch CommonGrams to Solr for help with slow phrase queries

2008-11-25 Thread Norberto Meijome
On Mon, 24 Nov 2008 13:31:39 -0500
Burton-West, Tom [EMAIL PROTECTED] wrote:

 The approach to this problem used by Nutch looks promising.  Has anyone
 ported the Nutch CommonGrams filter to Solr?
 
 Construct n-grams for frequently occurring terms and phrases while
 indexing. Optimize phrase queries to use the n-grams. Single terms are
 still indexed too, with n-grams overlaid.
 http://lucene.apache.org/nutch/apidocs-0.8.x/org/apache/nutch/analysis/CommonGrams.html

Tom,
i haven't used Nutch's implementation, but used the current implementation
(1.3) of ngrams and shingles to address exactly the same issue (database of
music albums and tracks).
We didn't notice any severe performance hit, but:
- the data set isn't huge (ca 1 MM docs).
- we reindex nightly via DIH from MS-SQL, so we can use a separate cache layer
to lower the number of hits to SOLR.

B
_
{Beto|Norberto|Numard} Meijome

Truth has no special time of its own.  Its hour is now -- always.
   Albert Schweitzer

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: port of Nutch CommonGrams to Solr for help with slow phrase queries

2008-11-25 Thread Norberto Meijome
On Wed, 26 Nov 2008 10:08:03 +1100
Norberto Meijome [EMAIL PROTECTED] wrote:

 We didn't notice any severe performance hit but :
 - data set isn't huge ( ca 1 MM docs).
 - reindexed nightly via DIH from MS-SQL, so we can use a separate cache layer
 to lower the number of hits to SOLR.

To make this clear - there was a noticeable hit when we removed stop words, but
the nature of the beast forced our hand.

b

_
{Beto|Norberto|Numard} Meijome

Peace can only be achieved by understanding.
   Albert Einstein

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: Using Solr for indexing emails

2008-11-24 Thread Norberto Meijome
On Mon, 24 Nov 2008 20:21:17 +0200
Timo Sirainen [EMAIL PROTECTED] wrote:

 I think I gave enough reasons above for why I don't like this
 solution. :) I also don't like adding new shared global state databases
 just for Solr. Solr should be the one shared global state database..

fair enough - it makes more sense to me now :)

[...]
 Store the per-mailbox highest indexed UID in a new unique field created
 like user/uidvalidity/mailbox. Always update it by deleting the
 old one first and then adding the new one.

you mean delete, commit, add, commit? if you replace the record, simply
submitting the new document and committing would do (of course, you must ensure
the value of the  uniqueKey field matches, so SOLR replaces the old doc).
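
For example (a sketch - the field names follow this thread, the values are
made up), POSTing this to /update and then committing replaces the existing
doc that has the same uniqueKey value:

<add>
  <doc>
    <!-- uniqueKey field, built as user/uidvalidity/mailbox -->
    <field name="id">bob/1227399204/INBOX</field>
    <field name="highestuid">4711</field>
  </doc>
</add>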

 So to find out the highest
 indexed UID for a mailbox just look it up using its unique field. For
 finding the highest indexed UID for a user's all mailboxes do a single
 query:
 
  - fl=highestuid
  - q=highestuid:[* TO *]
  - fq=user:user

would it be faster to say q=user:user AND highestuid:[ * TO *]  ?  ( and i
guess you'd sort DESC and return 1 record only).

 If messages are being simultaneously indexed by multiple processes the
 highest-uid value may sometimes (rarely) be set too low, but that
 doesn't matter. The next search will try to re-add some of the messages
 that were already in index, but because they'll have the same unique IDs
 than what already exists they won't get added again. The highest-uid
 gets updated and all is well.

B
_
{Beto|Norberto|Numard} Meijome

Mind over matter: if you don't mind, it doesn't matter

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: [VOTE] Community Logo Preferences

2008-11-23 Thread Norberto Meijome
On Sun, 23 Nov 2008 11:59:50 -0500
Ryan McKinley [EMAIL PROTECTED] wrote:

 Please submit your preferences for the solr logo.

https://issues.apache.org/jira/secure/attachment/12394267/apache_solr_c_blue.jpg
https://issues.apache.org/jira/secure/attachment/12394263/apache_solr_a_blue.jpg
https://issues.apache.org/jira/secure/attachment/12394070/sslogo-solr-finder2.0.png
https://issues.apache.org/jira/secure/attachment/12394376/solr_sp.png
https://issues.apache.org/jira/secure/attachment/12394264/apache_solr_a_red.jpg

thanks!!
B

_
{Beto|Norberto|Numard} Meijome

Tell a person you're the Metatron and they stare at you blankly. Mention 
something out of a Charleton Heston movie and suddenly everyone's a Theology 
scholar!
   Dogma

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Using Solr for indexing emails

2008-11-23 Thread Norberto Meijome
On Sun, 23 Nov 2008 16:02:16 +0200
Timo Sirainen [EMAIL PROTECTED] wrote:

 Hi,

Hi Timo,

 
[...]

 The main problem is that before doing the search, I first have to check
 if there are any unindexed messages and then add them to Solr. This is
 done using a query like:
  - fl=uid
  - rows=1
  - sort=uid desc
  - q=uidv:uidvalidity box:mailbox user:user

So, if I understand correctly, the process is :

1. user sends search query Q to search interface
2. interface checks highest indexed uidv in SOLR
3. checks in IMAP store for mailbox if there are any objects ('emails') newer
than uidv from 2.
4. anything found in 3. is processed, submitted to SOLR, committed.
5. interface submits search query Q to index, gets results
6. results are presented / returned to user

It strikes me that this may work ok in some situations but may not scale. I
would decouple the {find new documents / submit / commit} process from the
{search / presentation} layer - ESPECIALLY if you plan to have several
mailboxes in play now.

 So it returns the highest IMAP UID field (which is an always-ascending
 integer) for the given mailbox (you can ignore the uidvalidity). I can
 then add all messages with higher UIDs to Solr before doing the actual
 search.
 
 When searching multiple mailboxes the above query would have to be sent
 to every mailbox separately. 

hmm...not sure what you mean by query would have to be sent to every
MAILBOX ... 

 That really doesn't seem like the best
 solution, especially when there are a lot of mailboxes. But I don't
 think Solr has a way to return highest uid field for each
 box:mailbox?

hmmm... maybe you can use facets on 'box' ... ? though you'd still have to
query for each box, i think...

 Is that above query even efficient for a single mailbox? 

i don't think so.

I did consider
 using separate documents for storing the highest UID for each mailbox,
 but that causes annoying desynchronization possibilities. Especially
 because currently I can just keep sending documents to Solr without
 locking and let it drop duplicates automatically (should be rare). With
 per-mailbox highest-uid documents I can't really see a way to do this
 without locking or allowing duplicate fields to be added and later some
 garbage collection deleting all but the one highest value (annoyingly
 complex).

I have a feeling the issues arise from serialising the whole process (as I
described above...). It makes more sense (to me) to implement something
similar to DIH, where you load data as needed (even a 'delta query', which
would only return new data)... I am not sure whether you could use DIH (RSS
feed from IMAP store?)

 I could of course also keep track of what's indexed on Dovecot's side,
 but that could also lead to desynchronization issues and I'd like to
 avoid them.
 
 I guess the ideal solution would be if it was somehow possible to create
 a SQL-like trigger that updates the per-mailbox highest-uid document
 whenever adding a new document with a higher UID value.

I am not sure how much effort you want to put into this... but I would think
that writing a lean app that periodically (for a period that makes sense for
your hardware and users' expectations... 5 minutes? 10? 1?) crawls the IMAP
stores for UIDs, processes them and submits to SOLR, and keeps its own state
(dbm or sqlite) may be a more flexible approach. Or, if dovecot supports this,
a 'plugin / hook' that sends a msg to your indexing app every time a new
document is created.

I am interested to hear what you decide to go with, and why.

cheers,
B

_
{Beto|Norberto|Numard} Meijome

All parts should go together without forcing. You must remember that the parts
you are reassembling were disassembled by you. Therefore, if you can't get them
together again, there must be a reason. By all means, do not use hammer. IBM
maintenance manual, 1975

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: How can i protect the SOLR Cores?

2008-11-20 Thread Norberto Meijome
On Wed, 19 Nov 2008 22:58:52 -0800 (PST)
RaghavPrabhu [EMAIL PROTECTED] wrote:

 I'm using multiple cores and all I need to do is to make each core
 secure. If I am accessing a particular core via URL, it should ask for
 and validate the credentials, say username & password, for each core.

You should be able to handle this @ the servlet container level. What I did, 
using Jetty + starting from the example app, was :

1) modify web.xml (part of the sources of solr.war, which you'll have to 
rebuild)   to define the authentication constraints you want. 

[...]
<!-- block by default. -->
<security-constraint>
  <web-resource-collection>
    <web-resource-name>Default</web-resource-name>
    <url-pattern>/</url-pattern>
  </web-resource-collection>
  <auth-constraint/> <!-- BLOCK! -->
</security-constraint>

<!-- this constraint has no auth constraint or data constraint =>
     allows without auth. -->
<security-constraint>
  <web-resource-collection>
    <web-resource-name>AllowedQueries</web-resource-name>
    <url-pattern>/core1/select/*</url-pattern>
    <url-pattern>/core2/select/*</url-pattern>
    <url-pattern>/core3/select/*</url-pattern>
  </web-resource-collection>
</security-constraint>

<!-- this constraint allows access to admin pages, with basic auth -->
<security-constraint>
  <web-resource-collection>
    <web-resource-name>Admin</web-resource-name>
    <!-- the admin for cores management -->
    <url-pattern>/admin/*</url-pattern>
    <!-- the admin for each individual core -->
    <url-pattern>/core1/admin/*</url-pattern>
    <url-pattern>/core2/admin/*</url-pattern>
    <url-pattern>/core3/admin/*</url-pattern>
    <!-- The Test core, full access to it -->
    <url-pattern>/_test_/*</url-pattern>
  </web-resource-collection>
  <auth-constraint>
    <!-- Roles of users are defined in the properties file -->
    <!-- we allow users with admin-only access -->
    <role-name>Admin-role</role-name>
    <!-- we allow users with full access -->
    <role-name>FullAccess-role</role-name>
  </auth-constraint>
</security-constraint>

<!-- this constraint allows access to modify the data in the SOLR
     service, with basic auth -->
<security-constraint>
  <web-resource-collection>
    <web-resource-name>RW</web-resource-name>
    <!-- the dataimport handler for each individual core -->
    <url-pattern>/core1/dataimport</url-pattern>
    <url-pattern>/core2/dataimport</url-pattern>
    <url-pattern>/core3/dataimport</url-pattern>
    <!-- the update handler (XML over HTTP) for each individual core -->
    <url-pattern>/core1/update/*</url-pattern>
    <url-pattern>/core2/update/*</url-pattern>
    <url-pattern>/core3/update/*</url-pattern>
  </web-resource-collection>
  <auth-constraint>
    <!-- Roles of users are defined in the properties file -->
    <!-- we allow users with rw-only access -->
    <role-name>RW-role</role-name>
    <!-- we allow users with full access -->
    <role-name>FullAccess-role</role-name>
  </auth-constraint>
</security-constraint>

<!-- the Realm for this app. Ideally we should have different realms
     for each security-constraint, but I can't get it to work properly -->
<login-config>
  <auth-method>BASIC</auth-method>
  <realm-name>SearchSvc</realm-name>
</login-config>
<security-role>
  <role-name>Admin-role</role-name>
</security-role>
<security-role>
  <role-name>FullAccess-role</role-name>
</security-role>
<security-role>
  <role-name>RW-role</role-name>
</security-role>

[...]

2) in Jetty's jetty.xml (or in a context...i just used jetty.xml), define where 
to get the AUTH details from :
[...]
<Set name="UserRealms">
  <Array type="org.mortbay.jetty.security.UserRealm">
    <Item>
      <New class="org.mortbay.jetty.security.HashUserRealm">
        <Set name="name">SearchSvc</Set>
        <Set name="config"><SystemProperty name="jetty.home" default="."/>/etc/searchsvc_access.properties</Set>
        <!-- <Set name="reloadInterval">10</Set> -->
        <!-- <Call name="start"/> -->
      </New>
    </Item>
[...]


3) Read in jetty's documentation how to create the .properties file with the 
auth info...

I am not sure if this is the BEST way 

Re: Use SOLR like the MySQL LIKE

2008-11-19 Thread Norberto Meijome
On Tue, 18 Nov 2008 14:26:02 +0100
Aleksander M. Stensby [EMAIL PROTECTED] wrote:

 Well, then I suggest you index the field in two different ways if you want  
 both possible ways of searching. One, where you treat the entire name as  
 one token (in lowercase) (then you can search for avera* and match on for  
 instance average joe etc.) And then another field where you tokenize on  
 whitespace for instance, if you want/need that possibility aswell. Look at  
 the solr copy fields and try it out, it works like a charm :)

You should also make extensive use of  analysis.jsp  to see how data in your
field (1) is tokenized, filtered and indexed, and how your search terms are
tokenized, filtered and matched against (1). 
Hint 1 : check all the checkboxes ;)
Hint 2: you don't need to reindex all your data, just enter test data in the
form and give it a go. You will of course have to tweak schema.xml and restart
your service when you do this.

good luck,
B
_
{Beto|Norberto|Numard} Meijome

Intellectual: 'Someone who has been educated beyond his/her intelligence'
   Arthur C. Clarke, from 3001, The Final Odyssey, Sources.

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: Solr Core Size limit

2008-11-12 Thread Norberto Meijome
On Tue, 11 Nov 2008 10:25:07 -0800 (PST)
Otis Gospodnetic [EMAIL PROTECTED] wrote:

 Doc ID gaps are zapped during segment merges and index optimization.
 

thanks Otis :)
b
_
{Beto|Norberto|Numard} Meijome

I didn't attend the funeral, but I sent a nice letter saying  I approved of 
it.
  Mark Twain

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Solr Core Size limit

2008-11-12 Thread Norberto Meijome
On Tue, 11 Nov 2008 20:39:32 -0800 (PST)
Otis Gospodnetic [EMAIL PROTECTED] wrote:

 With Distributed Search you are limited to # of shards * Integer.MAX_VALUE.

yeah, makes sense. And i would suspect since this is PER INDEX, it applies to
each core only (so you could have n cores in m shards for n * m *
Integer.MAX_VALUE docs).


_
{Beto|Norberto|Numard} Meijome

The more I see the less I know for sure. 
  John Lennon

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Solr Core Size limit

2008-11-10 Thread Norberto Meijome
On Mon, 10 Nov 2008 10:24:47 -0800 (PST)
Otis Gospodnetic [EMAIL PROTECTED] wrote:

 I don't think there is a limit other than your hardware and the internal Doc
 ID which limits you to 2B docs on 32-bit machines.

Hi Otis,
just curious - is this internal doc ID reused when an optimise happens? or are
gaps left and re-filled when 2B is reached?

cheers,
b

_
{Beto|Norberto|Numard} Meijome

Whenever you find that you are on the side of the majority, it is time to 
reform.
   Mark Twain

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: How to use multicore feature in JBOSS

2008-11-05 Thread Norberto Meijome
On Tue, 4 Nov 2008 23:45:40 -0800 (PST)
con [EMAIL PROTECTED] wrote:

 But for the first question, I am still not clear.
 I think to use the multicore feature we should inform the server. In the
 Jetty server, we are starting the server using:   java
 -Dsolr.solr.home=multicore -jar start.jar
 Once the server is started I think it will take the parameters from
 multicore/solr.xml.
 
 But I am confused on how and where to pass this argument to JBOSS. 

Con,
Sorry, i don't have a jboss available to test... what happens if you use the 
standard configuration ( with solr.xml @ the top level of your solr directory, 
NOT in multicore/ ) 

launch it, look @ the debug messages , see which cores are picked up (from the 
admin page ). 

FWIW, by having {solr_installation_directory}/solr.xml , I never had to tell 
jetty where solr.xml was.  IIRC, multicore/solr.xml is the layout in the 
example app , because the default config is 1-core only.
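
For reference, a minimal multicore solr.xml sketch (core names and
instanceDirs are placeholders):

<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="core0" instanceDir="core0"/>
    <core name="core1" instanceDir="core1"/>
  </cores>
</solr>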

b

_
{Beto|Norberto|Numard} Meijome

We must openly accept all ideologies and systems as  means of solving 
humanity's problems. One country, one nation, one ideology, one system is not 
sufficient.
   Dalai Lama.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: How to use multicore feature in JBOSS

2008-11-04 Thread Norberto Meijome
On Tue, 4 Nov 2008 09:55:38 -0800 (PST)
con [EMAIL PROTECTED] wrote:

 1) Which files do I need to edit to use the multicore feature?
 2) Also, where can I specify the index directory so that we can point the
 indexed documents to a custom folder instead of jboss/bin?

Con, please check the wiki - the answers should be there 

(
 1) = solr.xml ( previously multicore.xml)
2) look in solrconfig.xml for each core
)
_
{Beto|Norberto|Numard} Meijome

Windows: Where do you want to go today?
Linux: Where do you want to go tomorrow?
FreeBSD: Are you guys coming, or what?

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Solr Searching on other fields which are not in query

2008-10-30 Thread Norberto Meijome
On Thu, 30 Oct 2008 15:50:58 -0300
Jorge Solari [EMAIL PROTECTED] wrote:

 copyField source=* dest=text/
 
 in the schema file.

or use Dismax query handler.
b

_
{Beto|Norberto|Numard} Meijome

Windows: Where do you want to go today?
Linux: Where do you want to go tomorrow?
FreeBSD: Are you guys coming, or what?

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: DIH and rss feeds

2008-10-30 Thread Norberto Meijome
On Thu, 30 Oct 2008 20:46:16 -0700
Lance Norskog [EMAIL PROTECTED] wrote:

 Now: a few hours later there are a different 100 lastest documents. How do
 I add those to the index so I will have 200 documents?  'full-import' throws
 away the first 100. 'delta-import' is not implemented. What is the special
 trick here?  I'm using the Solr-1.3.0 release.
  

Lance,

1) DIH has a clean parameter that, when set to true (the default, i think), will
delete all existing docs in the index.

2) ensure your new documents have different values in the field defined as
uniqueKey (schema.xml).
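
e.g., to import without wiping the index first (assuming the handler is
registered at /dataimport, as in the example solrconfig.xml):

http://localhost:8983/solr/dataimport?command=full-import&clean=false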

let us know how it goes,
B

_
{Beto|Norberto|Numard} Meijome

Lack of planning on your part does not constitute an emergency on ours.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: search not working correctly

2008-10-20 Thread Norberto Meijome
On Mon, 20 Oct 2008 03:24:36 -0700 (PDT)
prerna07 [EMAIL PROTECTED] wrote:

 Yes, We want search on these incomplete words.

Look into the NGram token factory - works a treat. I don't think it's
explained a lot in the wiki, but it has been discussed in this list in the past,
and you also have the JavaDoc and the source itself.

FWIW, I had problems getting it to work properly with minNgram != maxNGram
- analysis.jsp shows a match, but it didn't work in the QH . It could
*definitely* have been myself or code @ the time I tested it (pre 1.3
release)... I'll test again to see if it is happening and log a bug if
needed.
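
For reference, a minimal ngram field type sketch (the type name is a
placeholder; note the caveat above about minGramSize != maxGramSize):

<fieldType name="text_ngram3" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.NGramTokenizerFactory" minGramSize="3" maxGramSize="3"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>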

B

_
{Beto|Norberto|Numard} Meijome

There are two kinds of stupid people. One kind says,'This is old and therefore
good'. The other kind says, 'This is new, and therefore better.'
 John Brunner, 'The Shockwave Rider'.

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: Synonym format not working

2008-10-20 Thread Norberto Meijome
On Mon, 20 Oct 2008 00:08:07 -0700 (PDT)
prerna07 [EMAIL PROTECTED] wrote:

 
 
 The issue with synonyms arises when I have a number in the synonym definition:

 ccc => 1,2 gives the following result with debugQuery=true:
  <str name="parsedquery">MultiPhraseQuery(all:"(1 ) (2 ccc ) 3")</str>
  <str name="parsedquery_toString">all:"(1 ) (2 ccc ) 3"</str>

 However fooaaa => fooaaa, baraaa, bazaaa gives correct synonym results:

  <str name="parsedquery">all:fooaaa all:baraaa all:bazaaa</str>
  <str name="parsedquery_toString">all:fooaaa all:baraaa all:bazaaa</str>

 Any pointers to solve the issue with numbers in synonyms?

Prerna,
in your first email you show your field type has :

[...]
<filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="1" generateNumberParts="1" catenateWords="0"
        catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
[..]

generateNumberParts="1" will, AFAIK, generate a separate token on a number, so
ccc1 will be indexed as ccc, 1. If you use admin/analysis.jsp you can see
the step-by-step process taken by the tokenizer + filters for your data type -
you can then tweak it as necessary until you are happy with the results.

b
_
{Beto|Norberto|Numard} Meijome

Immediate success shouldn't be necessary as a motivation to do the right thing.

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: query parsing issue + behavior as OR (solr 1.4-dev)

2008-10-20 Thread Norberto Meijome
On Mon, 20 Oct 2008 06:21:06 -0700 (PDT)
Sunil Sarje [EMAIL PROTECTED] wrote:

 I am working with the nightly build of Oct 17, 2008 and found an issue:
 something is wrong with LuceneQParserPlugin; it takes + as OR

Sunil, please do not hijack the thread :

http://en.wikipedia.org/wiki/Thread_hijacking

thanks,
B

_
{Beto|Norberto|Numard} Meijome

He could be a poster child for retroactive birth control.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: solr1.3 - testing language ?

2008-10-20 Thread Norberto Meijome
On Mon, 20 Oct 2008 06:25:09 -0700 (PDT)
sunnyfr [EMAIL PROTECTED] wrote:

 I implemented multi language search, but I didn't finished the website in
 PHP, how can I check it works properly?

maybe by sending to SOLR the queries you plan your PHP frontend to generate ? 

_
{Beto|Norberto|Numard} Meijome

Always do right.  This will gratify some and astonish the rest.
  Mark Twain

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Sorting performance

2008-10-20 Thread Norberto Meijome
On Mon, 20 Oct 2008 16:28:23 +0300
christophe [EMAIL PROTECTED] wrote:

 Hum. this means I have to wait before I index new documents and avoid
 indexing when they are created (I have about 50 000 new documents
 created each day and I was planning to make those searchable ASAP).

you can always index + optimize out of band in a 'master' / RW server , and
then send the updated index to your slave (the one actually serving the
requests). 

This *will NOT* remove the need to refresh your cache, but it will remove any
delay introduced by commit/indexing + optimise.

 Too bad there is no way to have a centralized cache that can be shared 
 AND updated when new documents are created.

hmm not sure it makes sense like that... but maybe along the lines of having an
active cache that is used to serve queries, and new ones being prepared, and
then swapped when ready. 

Speaking of which (or not :P) , has anyone thought about / done any work on
using memcached for these internal solr caches? I guess it would make sense for
setups with several slaves ( or even a master updating memcached
too...)...though for a setup with shards it would be slightly more involved
(although it *could* be used to support several slaves per 'data shard' ).

All the best,
B
_
{Beto|Norberto|Numard} Meijome

RTFM and STFW before anything bad happens.

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: Advice on analysis/filtering?

2008-10-20 Thread Norberto Meijome
On Thu, 16 Oct 2008 16:09:17 +0200
Jarek Zgoda [EMAIL PROTECTED] wrote:

 They came to such expectations seeing Solr's own Spellcheck at work -  
 if it can suggest correct versions, it should be able to sanitize  
 broken words in documents and search them using sanitized input. For  
 me, this seemed reasonable request (of course, if this can be achieved  
 reasonably abusing solr's spellcheck component).

don't forget that the solr spellchecker finds its suggestions based on your
corpus. so if you don't have a correctly spelt version of wordA , you won't
receive back wordA as a 'spellchecked' version of that word. I think that's how
it works by default (which is all I've needed so far).
I *think* there is a way to use an external spellchecker (component or list) -
so you could have your full list of Polish words in a file, i guess

I agree playing with analysis.jsp is the best approach to solving these
problems ( tick all the boxes and see how the changes to your terms take place).

good luck - let us know what you come up with :)

B
_
{Beto|Norberto|Numard} Meijome

You can discover what your enemy fears most by observing the means he uses to
frighten you. Eric Hoffer (1902 - 1983)

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: solr1.3 - testing language ?

2008-10-20 Thread Norberto Meijome
On Mon, 20 Oct 2008 08:16:50 -0700 (PDT)
sunnyfr [EMAIL PROTECTED] wrote:

 ok so straight by the admin part !

Hi Johanna - not sure what you mean by 'the admin part'. 

 it should work .. so it doesn't 

if you tell us what you did (what url you called) , what you expect to receive
back (sample of your indexed data) and what you get instead , we may be able to
offer better answers...

b

_
{Beto|Norberto|Numard} Meijome

Two things have come out of Berkeley, Unix and LSD.
It is uncertain which caused the other.

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: dismax and long phrases

2008-10-09 Thread Norberto Meijome
On Tue, 07 Oct 2008 09:27:30 -0700
Jon Drukman [EMAIL PROTECTED] wrote:

  Yep, you can fake it by only using fieldsets (qf) that have a 
  consistent set of stopwords.  
 
 does that mean changing the query or changing the schema?

Jon,
- you change schema.xml to define which type each field is. The fieldType says
whether you have stopwords or not.
- you change solrconfig.xml to define which fields dismax will query on.

i don't think you should have to change your query.

b

_
{Beto|Norberto|Numard} Meijome

Mix a little foolishness with your serious plans;
it's lovely to be silly at the right moment.
   Horace

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Problem in using Unique key

2008-10-09 Thread Norberto Meijome
On Wed, 8 Oct 2008 03:45:20 -0700 (PDT)
con [EMAIL PROTECTED] wrote:

 But in that case, while doing a full-import I am getting the following
 error:
 
 org.apache.solr.common.SolrException: QueryElevationComponent requires the
 schema to have a uniqueKeyField 

Con, if you don't use the Query Elevation component, you can disable it in
solrconfig.xml. Not sure why a uniqueKeyField is needed for it, though.

b

_
{Beto|Norberto|Numard} Meijome

First they ignore you, then they laugh at you, then they fight you, then you 
win.
  Mahatma Gandhi.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Dismax , query phrases

2008-10-01 Thread Norberto Meijome
On Tue, 30 Sep 2008 11:43:57 -0700 (PDT)
Chris Hostetter [EMAIL PROTECTED] wrote:

 
 : That's why I was wondering how Dismax breaks it all apart. It makes sense...
 : I suppose what I'd like to have is a way to tell dismax which fields NOT to
 : tokenize the input for. For these fields, it would pass the full q instead
 : of each part of it. Does this make sense? would it be useful at all?
 
 the *goal* makes sense, but the implementation would be ... problematic.
 
 you have to remember the DisMax parser's whole way of working is to make 
 each chunk of input match against any qf field, and find the highest 
 scoring field for each chunk, with this input...
 
   q = some phrase   qf = a b c
 
 ...you get...
 
   ( (a:some | b:some | c:some) (a:phrase | b:phrase | c:phrase) )
 
 ...even if dismax could tell that c was a field that should only support 
 exact matches,

thanks Hoss,

it would by a configuration option. 

 how would it fit c:some phrase into that structure?

does this make sense?

 ( (a:some | b:some ) (a:phrase | b:phrase) (c:"some phrase") )


 I've already kinda forgotten how this thread started ... 

trying to get *exact* matches to always score higher using dismax - keeping in
mind that I have multiple exact fields, with different boosts...

 but would it make 
 sense to just use your exact fields in the pf, and have inexact versions 
 of them in the qf?  then docs that match your input exactly should score 
 at the top, but less exact matches will also still match.

aha! right, i think that makes sense...i obviously haven't got my head properly
around all the different functionality of dismax.

I will try it when I'm back @ work... right now, i seem to have solved the
problem by using shingles - the fields are artists, song & album titles, so
high matching on shingles is quite approximate to exact matching - except that
I had to remove stopwords, so that impacts on performance.
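
For reference, Hoss' qf/pf suggestion as solrconfig.xml defaults might look
roughly like this (handler name and the inexact fields/boosts are placeholders;
the _exact boosts are the ones from this thread):

<requestHandler name="dismax" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <!-- inexact (tokenized) versions of the fields -->
    <str name="qf">artist^4.0 title^6.0</str>
    <!-- exact (KeywordTokenizer) fields, so full-string matches score highest -->
    <str name="pf">artist_exact^100.0 title_exact^200.0</str>
  </lst>
</requestHandler>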

Thanks again :)
B
_
{Beto|Norberto|Numard} Meijome

Which is worse: ignorance or apathy?
Don't know. Don't care.

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: Dismax , query phrases

2008-09-29 Thread Norberto Meijome
On Fri, 26 Sep 2008 10:42:42 -0700 (PDT)
Chris Hostetter [EMAIL PROTECTED] wrote:

 : <tokenizer class="solr.KeywordTokenizerFactory"/> <!-- The LowerCase
 : TokenFilter does

 : Now, when I search with ?q=the doors, all the terms in my q= aren't used
 : together to build the dismax query, so I never get a match on the _exact
 : fields:
 
 The query parser (even the dismax queryparser) does its whitespace
 chunking before handing any input off to the analyzer for the
 appropriate field, so with [[ ?q=the doors ]] "the" and "doors" are going
 to get analyzed separately ... which is why you see artist_exact:the^100.0
 and artist_exact:doors^100.0 in your parsedquery -- *BUT* since you used
 KeywordTokenizer at index time, you'll never get a match for either of
 those on any document (unless the artist is just "the" or "doors")

Hi Hoss :)
thanks for the feedback - I arrived @ the same conclusion . The biz requirement
is that these *_exact fields match exactly the original contents of the field.
Right now we are using Dismax, and changing this means rewriting a lot of the
queries , which isn't possible. 

That's why I was wondering how Dismax breaks it all apart. It makes sense...I
suppose what I'd like to have is a way to tell dismax which fields NOT to
tokenize the input for. For these fields, it would pass the full q instead of
each part of it. Does this make sense? would it be useful at all? 

 : I've tried with other queries that don't include stopwords (smashing
 : pumpkins, for example), and in all cases, if I don't use quotes (""), only
 : the LAST word is used with my _exact fields (tried with 1, 2 and 3 words,
 : always the same against my _exact fields..)
 
 this LAST word part doesn't make sense to me ... you can see "the"
 making it into your query on the *_exact fields in the first
 DisjunctionMaxQuery, do you have toStrings for these other queries we
 could see to understand what you mean?

I agree, it makes sense as you say...i must have missed the initial tokens. I
can't confirm atm, so I'll follow the common sense path :)

As usual, thanks for your time and insights :)

B
_
{Beto|Norberto|Numard} Meijome

Humans die and turn to dust, but writing makes us remembered
  4000-year-old words of an Egyptian scribe

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: Create Indexes

2008-09-29 Thread Norberto Meijome
On Fri, 26 Sep 2008 18:58:14 +0530
Dinesh Gupta [EMAIL PROTECTED] wrote:

 Please tell me where to upload the files.

anywhere you have access to... your own website, somewhere anyone on the list 
can access the files you want to share to address your problems :)
b
_
{Beto|Norberto|Numard} Meijome

Science Fiction...the only genuine consciousness expanding drug
  Arthur C. Clarke

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: How to select one entity at a time?

2008-09-26 Thread Norberto Meijome
On Fri, 26 Sep 2008 00:46:07 -0700 (PDT)
con [EMAIL PROTECTED] wrote:

 To be more specific:
 I have the data-config.xml just like: 
 <dataConfig>
   <dataSource **/>
   <document>
     <entity name="user" query="select * from USER">
     </entity>

     <entity name="manager" query="select * from MANAGERS">
     </entity>

     <entity name="both" query="select * from MANAGERS,USER
             where MANAGERS.userID=USER.userID">
     </entity>
   </document>
 </dataConfig>

Con, I may be confused here...are you asking how to load only data from your 
USERS SQL table into SOLR, or how to search in your SOLR index for data about 
'USERS'.

data-config.xml is only relevant for the Data Import Handler...but your 
following question:

 
 I have 3 search conditions. when the client wants to search all the users,
 only the entity, 'user' must be executed. And if he wants to search all
 managers, the entity, 'manager' must be executed.
 
 How can i accomplish this through url?

*seems* to indicate you want to search on this .

If you want to search on a particular field from your SOLR schema, DIH is not 
involved. If you use the standard QH, you say ?q=user:Bob 

If I misunderstood your question, please explain...

cheers,
b
_
{Beto|Norberto|Numard} Meijome

Everything is interesting if you go into it deeply enough
  Richard Feynman

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Create Indexes

2008-09-26 Thread Norberto Meijome
On Fri, 26 Sep 2008 16:32:05 +0530
Dinesh Gupta [EMAIL PROTECTED] wrote:

 Is it OK to create the whole index via the Solr web-app?
 If not, then how can I create the index?

 I have attached some files that create the index now.
 

Dinesh,
you sent the same email 2 1/2 hours ago. sending it again will not give you 
more answers.

If you have a file you want to share, you should upload it to a webserver and 
share the URL - most mailing lists drop any file attachments.


_
{Beto|Norberto|Numard} Meijome

Never take Life too seriously, no one gets out alive anyway.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: How to select one entity at a time?

2008-09-26 Thread Norberto Meijome
On Fri, 26 Sep 2008 02:35:18 -0700 (PDT)
con [EMAIL PROTECTED] wrote:

 What you meant is correct. Please excuse me, I am new to solr. :-(

Con, have a read here :

http://www.ibm.com/developerworks/java/library/j-solr1/

it helped me pick up the basics a while back. it refers to 1.2, but the core 
concepts are relevant to 1.3 too.

b
_
{Beto|Norberto|Numard} Meijome

Hildebrant's Principle:
If you don't know where you are going,
any road will get you there.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: How to select one entity at a time?

2008-09-26 Thread Norberto Meijome
On Fri, 26 Sep 2008 02:35:18 -0700 (PDT)
con [EMAIL PROTECTED] wrote:

 What you meant is correct. Please excuse me, I am new to solr. :-(

hi Con,
nothing to be excused for..but you may want to read the wiki , as it provides
quite a lot of information that should answer your questions. DIH is great, but
I wouldn't go near it until you understand how to create your own schema.xml
and solrconfig.xml .

http://wiki.apache.org/solr/FrontPage is the wiki

( everyone else ... is there a guide on getting started on SOLR ? step by step,
taking the example and changing it for your own use?  )

 I want to index all the query results. (I think this will be done by the
 data-config.xml) 

hmm... terminology :-)
you index documents (similar to records in a database).

when you send a query to Solr, you will get results if your query matches any
of them.

 Now while accessing this indexed data, i need this filtering. ie. Either
 user or manager.
 I tried your suggestion:
 http://localhost:8983/solr/select/?q=user:bob&version=2.2&start=0&rows=10&indent=on&wt=json

the url LOOKS ok. do you have any document in your index with field user
containing 'bob'?

try this to get all results ( xml format, first 3 results only...

http://localhost:8983/solr/select/?q=*:*&rows=3

then, find a field with a value , then search for that value and see if you get
that document back - it should work...(with lots of caveats, yes)..

If you send us the result we can help u understand better why it isn't
working as you intend..
b
_
{Beto|Norberto|Numard} Meijome

First they ignore you, then they laugh at you, then they fight you, then you
win. Mahatma Gandhi.

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: Dismax , query phrases

2008-09-25 Thread Norberto Meijome
On Wed, 24 Sep 2008 08:34:57 -0700 (PDT)
Otis Gospodnetic [EMAIL PROTECTED] wrote:

 What happens if you change ps from 100 to 1 and comment out that ord function?
 
 

Otis, I think what I am after is what Hoss described in his last paragraph in 
his reply to your email last year :

http://www.nabble.com/DisMax-and-REQUIRED-OR-REQUIRED-query-rewrite-td13395349.html#a13395349

ie, I want everything that Dismax does, BUT , on certain fields, I want it to 
search for all the terms in my q= , as a phrase.

I am thinking of modifying dismax to allow this to be passed as a configuration 
( eg, fieldsSearchExact=artist_exact, title_exact), but if I can avoid it 
that'd be great :).

any other ideas, anyone??

thanks!
B
_
{Beto|Norberto|Numard} Meijome

Nature doesn't care how smart you are. You can still be wrong.
  Richard Feynman

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Shingles , min size?

2008-09-25 Thread Norberto Meijome
hi guys,
I may have missed it ,but is it possible to tell the solr.ShingleFilterFactory 
the minimum number of grams to generate per shingle?  Similar to 
NGramTokenizerFactory's minGramSize=3 maxGramSize=3 

thanks!
B
_
{Beto|Norberto|Numard} Meijome

Ask not what's inside your head, but what your head's inside of.
   J. J. Gibson

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Using Shingles to Increase Phrase Search Performance

2008-09-24 Thread Norberto Meijome
On Sat, 16 Aug 2008 15:39:44 -0700
Chris Harris [EMAIL PROTECTED] wrote:

[...]
 So finally I modified the Lucene ShingleFilter class to add an
 outputUnigramIfNoNgram option. Basically, if you set that option,
 and also set outputUnigrams=false, then the filter will tokenize just
 as in Exhibit B, except that if the query is only one word long, it
 will return a corresponding single token, rather than zero tokens. In
 other words,
 
 [Exhibit C]
 please -
   please
 
 Things were still zippy. And, so far, I think I have seriously
 improved my phrase search performance without ruining anything.

hi Chris,
 is this change part of 1.3 ? 

I've tried 
<fieldType name="shingle4_mark2" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="4"
            outputUnigrams="false" outputUnigramIfNoNgram="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>


but analysis.jsp shows no tokens generated when there is only 1 word. 

thanks!
B

_
{Beto|Norberto|Numard} Meijome

 I sense much NT in you.
 NT leads to Bluescreen.
 Bluescreen leads to downtime.
 Downtime leads to suffering.
 NT is the path to the darkside.
 Powerful Unix is.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Dismax , query phrases

2008-09-24 Thread Norberto Meijome
Hello,
I've seen references to this in the list, but not completely explained...my
apologies if this is FAQ (and for the length of the email).

I am using dismax across a number of fields on an index with data about music
albums  songs - the fields are quite full of stop words. I am trying to boost
'exact' matches - ie, if you search for 'The Doors', those documents with 'The
Doors' should be first. I've created the following fieldType and I use it for 
fields artist_exact and title_exact:


<fieldType name="lowerCaseString" class="solr.TextField"
           sortMissingLast="true" omitNorms="true">
  <analyzer>
    <!-- KeywordTokenizer does no actual tokenizing, so the entire
         input string is preserved as a single token -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- The LowerCase TokenFilter does what you expect, which can be
         when you want your sorting to be case insensitive -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- The TrimFilter removes any leading or trailing whitespace -->
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldType>

I then give artist_exact and title_exact pretty high boosts ( title_exact^200.0
artist_exact^100.0 )

Now, when I search with ?q=the doors , all the terms in my q= aren't used
together to build the dismaxQuery , so I never get a match on the _exact fields:

(there are a few other fields involved...pretty self explanatory)

<str name="rawquerystring">the doors</str>
<str name="querystring">the doors</str>

<str name="parsedquery">
+((DisjunctionMaxQuery((title_ngram2:"th he"^0.1 | artist_ngram2:"th he"^0.1 |
title_ngram3:the^4.5 | artist_ngram3:the^3.5 | artist_exact:the^100.0 |
title_exact:the^200.0)~0.01) DisjunctionMaxQuery((genre:door^0.2 |
title_ngram2:"do oo or rs"^0.1 | artist_ngram2:"do oo or rs"^0.1 |
title_ngram3:"doo oor ors"^4.5 | title:door^6.0 | artist_ngram3:"doo oor
ors"^3.5 | artist:door^4.0 | artist_exact:doors^100.0 |
title_exact:doors^200.0)~0.01))~2) DisjunctionMaxQuery((title:door^2.0 |
artist:door^0.8)~0.01) FunctionQuery((ord(release_year))^0.5)</str>

<str name="parsedquery_toString">+(((title_ngram2:"th he"^0.1 |
artist_ngram2:"th he"^0.1 | title_ngram3:the^4.5 | artist_ngram3:the^3.5 |
artist_exact:the^100.0 | title_exact:the^200.0)~0.01 (genre:door^0.2 |
title_ngram2:"do oo or rs"^0.1 | artist_ngram2:"do oo or rs"^0.1 |
title_ngram3:"doo oor ors"^4.5 | title:door^6.0 | artist_ngram3:"doo oor
ors"^3.5 | artist:door^4.0 | artist_exact:doors^100.0 |
title_exact:doors^200.0)~0.01)~2) (title:door^2.0 | artist:door^0.8)~0.01
(ord(release_year))^0.5</str>


but, if I build my search as ?q="the doors"

<str name="parsedquery">
+DisjunctionMaxQuery((genre:door^0.2 | title_ngram2:"th he e   d do oo or
rs"^0.1 | artist_ngram2:"th he e   d do oo or rs"^0.1 | title_ngram3:"the he  e
d  do doo oor ors"^4.5 | title:door^6.0 | artist_ngram3:"the he  e d  do doo
oor ors"^3.5 | artist:door^4.0 | artist_exact:"the doors"^100.0 | title_exact:"the
doors"^200.0)~0.01) DisjunctionMaxQuery((title:door^2.0 | artist:door^0.8)~0.01)
FunctionQuery((ord(release_year))^0.5)</str>

<str name="parsedquery_toString">+(genre:door^0.2 | title_ngram2:"th he e   d
do oo or rs"^0.1 | artist_ngram2:"th he e   d do oo or rs"^0.1 |
title_ngram3:"the he  e d  do doo oor ors"^4.5 | title:door^6.0 |
artist_ngram3:"the he  e d  do doo oor ors"^3.5 | artist:door^4.0 |
artist_exact:"the doors"^100.0 | title_exact:"the doors"^200.0)~0.01
(title:door^2.0 | artist:door^0.8)~0.01 (ord(release_year))^0.5</str>

I've tried with other queries that don't include stopwords (smashing pumpkins,
for example), and in all cases, if I don't use quotes (" "), only the LAST word
is used with my _exact fields (tried with 1, 2 and 3 words, always the same
against my _exact fields..)

What is the reason for this behaviour? 

my full dismax config is :

<str name="mm">2<-1 5<-2 6<90%</str>
<str name="spellcheck">true</str>
<str name="spellcheck.extendedResults">true</str>
<str name="tie">0.01</str>
<str name="qf">
title_exact^200.0 artist_exact^100.0 title^6.0 title_ngram3^4.5 artist^4.0
artist_ngram3^3.5 title_ngram2^0.1 artist_ngram2^0.1 genre^0.2
</str>
<str name="q.alt">*:*</str>
<str name="spellcheck.collate">true</str>
<str name="defType">dismax</str>
<str name="spellcheck.onlyMorePopular">true</str>
<str name="rows">10</str>
<str name="pf">title^2.0 artist^0.8</str>
<str name="echoParams">all</str>
<str name="fl">*,score</str>
<str name="bf">ord(release_year)^0.5</str>
<str name="spellcheck.count">1</str>
<str name="ps">100</str>
</lst>

TIA!
B
_
{Beto|Norberto|Numard} Meijome

Never offend people with style when you can offend them with substance.
  Sam Brown

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: help required: how to design a large scale solr system

2008-09-24 Thread Norberto Meijome
On Wed, 24 Sep 2008 07:46:57 -0400
Mark Miller [EMAIL PROTECTED] wrote:

 Yes. You will def see a speed increasing by avoiding http (especially 
 doc at a time http) and using the direct csv loader.
 
 http://wiki.apache.org/solr/UpdateCSV

and for the obvious reason that if something breaks while you are indexing
directly from memory, can you restart the import? It may just be easier to keep
the data on disk and keep track of where you are up to while adding to the
index...
B
B
_
{Beto|Norberto|Numard} Meijome

Sysadmins can't be sued for malpractice, but surgeons don't have to deal with
patients who install new versions of their own innards.

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: Dismax , query phrases

2008-09-24 Thread Norberto Meijome
On Wed, 24 Sep 2008 08:34:57 -0700 (PDT)
Otis Gospodnetic [EMAIL PROTECTED] wrote:

 What happens if you change ps from 100 to 1 and comment out that ord function?
 
 
 Otis

Hi Otis,

no luck - without quotes (" ") :
<str name="rawquerystring">smashing pumpkins</str>
<str name="querystring">smashing pumpkins</str>
<str name="parsedquery">
+((DisjunctionMaxQuery((genre:smash^0.2 | title_ngram2:"sm ma as sh hi in
ng"^0.1 | artist_ngram2:"sm ma as sh hi in ng"^0.1 | title_ngram3:"sma mas ash
shi hin ing"^4.5 | title:smash^6.0 | artist_ngram3:"sma mas ash shi hin
ing"^3.5 | artist:smash^4.0 | artist_exact:smashing^100.0 |
title_exact:smashing^200.0)~0.01) DisjunctionMaxQuery((genre:pumpkin^0.2 |
title_ngram2:"pu um mp pk ki in ns"^0.1 | artist_ngram2:"pu um mp pk ki in
ns"^0.1 | title_ngram3:"pum ump mpk pki kin ins"^4.5 | title:pumpkin^6.0 |
artist_ngram3:"pum ump mpk pki kin ins"^3.5 | artist:pumpkin^4.0 |
artist_exact:pumpkins^100.0 | title_exact:pumpkins^200.0)~0.01))~2)
DisjunctionMaxQuery((title:"smash pumpkin"~1^2.0 | artist:"smash
pumpkin"~1^0.8)~0.01)
</str>

<str name="parsedquery_toString">
+(((genre:smash^0.2 | title_ngram2:"sm ma as sh hi in ng"^0.1 |
artist_ngram2:"sm ma as sh hi in ng"^0.1 | title_ngram3:"sma mas ash shi hin
ing"^4.5 | title:smash^6.0 | artist_ngram3:"sma mas ash shi hin ing"^3.5 |
artist:smash^4.0 | artist_exact:smashing^100.0 |
title_exact:smashing^200.0)~0.01 (genre:pumpkin^0.2 | title_ngram2:"pu um mp pk
ki in ns"^0.1 | artist_ngram2:"pu um mp pk ki in ns"^0.1 | title_ngram3:"pum
ump mpk pki kin ins"^4.5 | title:pumpkin^6.0 | artist_ngram3:"pum ump mpk pki
kin ins"^3.5 | artist:pumpkin^4.0 | artist_exact:pumpkins^100.0 |
title_exact:pumpkins^200.0)~0.01)~2) (title:"smash pumpkin"~1^2.0 |
artist:"smash pumpkin"~1^0.8)~0.01</str>

Still OK if I include quotes (" ")...

I am trying on another setup, with same data, to work with shingles rather than 
on 'exact' ... dismax seems to handle it much better...but it may be that I 
haven't added to that config all the ngram3 ngram3 fields for substring 
matching...

the resulting params were :

<str name="mm">2<-1 5<-2 6<90%</str>
<str name="spellcheck">true</str>
<str name="spellcheck.extendedResults">true</str>
<str name="tie">0.01</str>
<str name="tr">store_albums.xsl</str>
<str name="qf">
title_exact^200.0 artist_exact^100.0 title^6.0 title_ngram3^4.5 artist^4.0
artist_ngram3^3.5 title_ngram2^0.1 artist_ngram2^0.1 genre^0.2
</str>
<str name="q.alt">*:*</str>
<str name="spellcheck.collate">true</str>
<str name="wt">xml</str>
<str name="defType">dismax</str>
<str name="rows">10</str>
<str name="spellcheck.onlyMorePopular">true</str>
<str name="pf">title^2.0 artist^0.8</str>
<str name="echoParams">all</str>
<str name="fl">*,score</str>
<str name="spellcheck.count">1</str>
<str name="ps">1</str>
<str name="debugQuery">true</str>
<str name="echoParams">all</str>
<str name="wt">xml</str>
<str name="q">smashing pumpkins</str>

thanks,
B
_
{Beto|Norberto|Numard} Meijome

Don't remember what you can infer.
   Harry Tennant

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: help required: how to design a large scale solr system

2008-09-24 Thread Norberto Meijome
On Wed, 24 Sep 2008 11:45:34 -0400
Mark Miller [EMAIL PROTECTED] wrote:

 Nothing to stop you from breaking up the tsv/csv files into multiple 
 tsv/csv files.

Absolutely agreeing with you ... in one system where I implemented  SOLR, I
have a process run through the file system and lazily pick up new files as they
come in... if something breaks (and it will, as the files are user-generated in
many cases...), report it / leave it for later...move on. 

b

_
{Beto|Norberto|Numard} Meijome

I used to hate weddings; all the Grandmas would poke me and
say, You're next sonny! They stopped doing that when i
started to do it to them at funerals.

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: Defining custom schema

2008-09-24 Thread Norberto Meijome
On Wed, 24 Sep 2008 04:42:42 -0700 (PDT)
con [EMAIL PROTECTED] wrote:

 In the table we will be having various column names like CUSTOMER_NAME,
 CUSTOMER_PHONE etc. If we use the default schema.xml, we have to map these
 values to some the default values like cat, features etc. this will cause
 difficulty when we need to process the output.
 Instead can we set the column name and column type dynamically to the
 schema.xml so that the output will show something like
 <CUSTOMER_NAME>markrmiller</CUSTOMER_NAME>

Con,
the default schema you refer to is from the example application. You should 
definitely edit it and define your own fields.

b

_
{Beto|Norberto|Numard} Meijome

In my opinion, we don't devote nearly enough scientific research to finding a 
cure for jerks. 
  Calvin

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Any way to extract most used keywords from an index (or a random set)

2008-09-22 Thread Norberto Meijome
On Mon, 22 Sep 2008 15:46:54 +0530
Jacob Singh [EMAIL PROTECTED] wrote:

 Hi,
 
 I'm trying to write a testing suite to gauge the performance of solr
 searches.  To do so, I'd like to be able to find out what keywords
 will get me search results.  Is there anyway to programaticaly do this
 with luke?  I'm trying to figure out what all it exposes, but I'm not
 seeing this.
 

Hi Jacob,
are you after something that the following URLs don't provide ? 

http://host/solr/core/admin/luke?wt=xslt&tr=luke.xsl

but I actually prefer the schema browser ( 1.3 ) to see the top n terms per 
field...
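
e.g., something like this should list the top terms for one field (field name
is a placeholder; numTerms / fl are the params documented on the
LukeRequestHandler wiki page):

http://host/solr/core/admin/luke?fl=title&numTerms=10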

b
_
{Beto|Norberto|Numard} Meijome

If it's there, and you can see it, it's real.
If it's not there, and you can see it, it's virtual.
If it's there, and you can't see it, it's transparent.
If it's not there, and you can't see it, you erased it.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Special character matching 'x' ?

2008-09-18 Thread Norberto Meijome
On Thu, 18 Sep 2008 10:53:39 +0530
Sanjay Suri [EMAIL PROTECTED] wrote:

 One of my field values has the name Räikkönen, which contains special
 characters.
 
 Strangely, as I see it anyway, it matches on the search query 'x' ?
 
 Can someone explain or point me to the solution/documentation?

hi Sanjay,
Akshay should have given you an answer for this. In a more general way, if you
want to know WHY something is matching the way it is, run the query with
debugQuery=true . There are a few pages in the wiki which explain other
debugging techniques.
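
e.g. (made-up field/value - substitute your own):

http://localhost:8983/solr/select?q=name:foo&debugQuery=true

the parsed query and the score explanations in the response show exactly why
each document matched.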

b
_
{Beto|Norberto|Numard} Meijome

Ask not what's inside your head, but what your head's inside of.
   J. J. Gibson

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: about boost weight

2008-09-14 Thread Norberto Meijome
On Sat, 13 Sep 2008 16:17:12 +
zzh [EMAIL PROTECTED] wrote:

I think this is a stupid method, because the search conditions is too
 long, and the search efficiency will be low, we hope you can help me to solve
 this problem.

Hi,
IMHO,a long set of conditions doesn't make it stupid. You may not be going the
best way about it though. You may find
http://wiki.apache.org/solr/DisMaxRequestHandler an interesting and useful
read :)
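
a minimal sketch of a dismax query (field names and boosts are placeholders -
use your own):

http://localhost:8983/solr/select?q=some+words&defType=dismax&qf=title^2.0+description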

B
_
{Beto|Norberto|Numard} Meijome

Quality is never an accident, it is always the result of intelligent effort.
  John Ruskin  (1819-1900)

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: Regarding Indexing

2008-08-29 Thread Norberto Meijome
On Fri, 29 Aug 2008 00:31:13 -0700 (PDT)
sanraj25 [EMAIL PROTECTED] wrote:

 But still i cant maintain two index.
 please help me how to create two cores in solr

What specific problem do you have ?
B

_
{Beto|Norberto|Numard} Meijome

Always listen to experts.  They'll tell you what can't be done, and why.  
Then do it.
  Robert A. Heinlein

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: Regarding Indexing

2008-08-29 Thread Norberto Meijome
On Fri, 29 Aug 2008 02:37:10 -0700 (PDT)
sanraj25 [EMAIL PROTECTED] wrote:

 I want to store two independent datas in solr index. so I decided to create
 two index.But that's not possible.so  i go for multicore concept in solr
 .can u give me step by step procedure to create multicore in solr

Hi,
without specific questions, i doubt myself or others can give you any other
information than the documentation, which can be found at :

http://wiki.apache.org/solr/CoreAdmin
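
As a rough sketch (core names / dirs are placeholders), a minimal solr.xml for
two cores looks something like:

<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="core0" instanceDir="core0"/>
    <core name="core1" instanceDir="core1"/>
  </cores>
</solr>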

Please make sure you are using (a recent version of ) 1.3.

B
_
{Beto|Norberto|Numard} Meijome

Your reasoning is excellent -- it's only your basic assumptions that are wrong.

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: Storing two different files

2008-08-28 Thread Norberto Meijome
On Thu, 28 Aug 2008 02:01:05 -0700 (PDT)
sanraj25 [EMAIL PROTECTED] wrote:

 I want to  index two different files in solr.(for ex)  I want to store
 two tables like, job_post and job_profile in solr. But now both are stored
 in same place in solr.when i get data from job_post, data come from
 job_profile also.So i want to maintain the data of job_post and job_profile
 separately.

hi :)
you need to have 2 separate schemas, and therefore 2 separate indexes. You
should read about MultiCore in the wiki.

B

_
{Beto|Norberto|Numard} Meijome

Unix is very simple, but it takes a genius to understand the simplicity.
   Dennis Ritchie

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: Question about search suggestion

2008-08-26 Thread Norberto Meijome
On Tue, 26 Aug 2008 15:15:21 +0300
Aleksey Gogolev [EMAIL PROTECTED] wrote:

 
 Hello.
 
 I'm new to solr and I need to make a search suggest (like google
 suggestions).
 

Hi Aleksey,
please search the archives of this list for subjects containing 'autocomplete'
or 'auto-suggest'. that should give you a few ideas and starting points.
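
the usual starting point in those threads is an edge n-gram field - roughly
(untested sketch, assuming a 1.3 build; names are placeholders):

<fieldType name="autocomplete" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

index the user-facing strings into such a field, and prefix matching becomes a
simple term match at query time.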

best,
B

_
{Beto|Norberto|Numard} Meijome

The more I see the less I know for sure. 
  John Lennon

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: Multicore and snapshooter / snappuller

2008-08-25 Thread Norberto Meijome
On Fri, 22 Aug 2008 12:21:53 -0700
Lance Norskog [EMAIL PROTECTED] wrote:

  Apparently the ZFS (Silicon Graphics
 originally) is great for really huge files. 

hi Lance,
You may be  confusing Sun's ZFS with SGI's XFS. The OP referred, i think, to 
ZFS.

B

_
{Beto|Norberto|Numard} Meijome

The greatest dangers to liberty lurk in insidious encroachment by men of zeal, 
well-meaning but without understanding.
   Justice Louis D. Brandeis

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: dataimporthandler and mysql connector jar

2008-08-25 Thread Norberto Meijome
On Mon, 25 Aug 2008 17:11:47 +0200
Walter Ferrara [EMAIL PROTECTED] wrote:

 Launching a multicore solr with dataimporthandler using a mysql driver,
 (driver=com.mysql.jdbc.Driver) works fine if the mysql connector jar
 (mysql-connector-java-5.0.7-bin.jar) is in the classpath, either jdk
 classpath or inside the solr.war lib dir.
 While putting the mysql-connector-java-5.0.7-bin.jar in core0/lib
 directory, or in the multicore shared lib dir (specified in sharedLib
 attribute in solr.xml) result in exception, even if the jar is correctly
 loaded by the classloader:


Hi Walter,
As of the nightly build of August 19th, the DIH failing to connect to the data
source on SOLR's startup does *not* kill SOLR anymore. I haven't tested
yesterday's... it could be a regression bug, but I doubt it - the error used to
be different to yours (about connectivity, not a failure in the document).

for what is worth,i only have 1 copy of the jdbc jar (MS SQL in my case), in
the SOLR's lib directory, used by several cores's own DIH. You can check if
it's picked up by SOLR's classpath in the Java Info page under admin/

You may also want to try with a valid but empty document definition in
data-config.xml to rule out syntax issues.
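
e.g., something as minimal as this (driver/URL/table names are placeholders):

<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/test" user="user" password="pass"/>
  <document>
    <entity name="item" query="select id from item">
      <field column="id" name="id"/>
    </entity>
  </document>
</dataConfig>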

B
_
{Beto|Norberto|Numard} Meijome

Any society that would give up a little liberty to gain a little security will
deserve neither and lose both. Benjamin Franklin

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: Querying Question

2008-08-21 Thread Norberto Meijome
On Thu, 21 Aug 2008 18:09:11 -0700
Jake Conk [EMAIL PROTECTED] wrote:

 I thought if I used copyField / to copy my string field to a text
 field then I can search for words within it and not limited to the
 entire content. Did I misunderstand that?

but you need to search on the fields that are defined as fieldType=text...it 
seems you are searching on the string fields.
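
i.e., roughly this in schema.xml (field names are placeholders):

<field name="title" type="string" indexed="true" stored="true"/>
<field name="title_t" type="text" indexed="true" stored="false"/>
<copyField source="title" dest="title_t"/>

and then query title_t:word instead of title:word - copyField only copies the
data; it doesn't change which field your query runs against.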

B

_
{Beto|Norberto|Numard} Meijome

He has the attention span of a lightning bolt.
  Robert Redford

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: hello, a question about solr.

2008-08-20 Thread Norberto Meijome
On Wed, 20 Aug 2008 10:58:50 -0300
Alexander Ramos Jardim [EMAIL PROTECTED] wrote:

 A tiny but really explanation can be found here
 http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters


thanks Alexander - indeed, quite short, and focused on shingles... which, if
I understand correctly, are groups of n terms... the NGramTokenizer
creates tokens of n characters from your input.

Searching for ngram or n-gram in the archives should bring more relevant 
information up, which isnt in the wiki yet.

B

_
{Beto|Norberto|Numard} Meijome

All that is necessary for the triumph of evil is that good men do nothing.
  Edmund Burke

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: hello, a question about solr.

2008-08-18 Thread Norberto Meijome
On Mon, 18 Aug 2008 15:33:02 +0800
finy finy [EMAIL PROTECTED] wrote:

 the name field is text,which is analysed, i use the query
 name:ibmT63notebook

why do you search with no spaces? is this free text entered by a user, or is it 
part of a link which you control ?

PS: please dont top-post

_
{Beto|Norberto|Numard} Meijome

Commitment is active, not passive. Commitment is doing whatever you can to 
bring about the desired result. Anything less is half-hearted.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


.wsdl for example....

2008-08-18 Thread Norberto Meijome
hi :)

does anyone have a .wsdl definition for the example bundled with SOLR? 

if nobody has it, would it be useful to have one ?

cheers,
B
_
{Beto|Norberto|Numard} Meijome

Intelligence: Finding an error in a Knuth text.
Stupidity: Cashing that $2.56 check you got.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: hello, a question about solr.

2008-08-18 Thread Norberto Meijome
On Mon, 18 Aug 2008 23:07:19 +0800
finy finy [EMAIL PROTECTED] wrote:

 because i use chinese character, for example ibm___
 solr will parse it into a term ibm and a phraze _ __
 can i use solr to query with a term ibm and a term _  and a term 
 __?

Hi finy,
you should look into n-gram tokenizers. Not sure if it is documented in the 
wiki, but it has been discussed in the mailing list quite a few times.

in short, an n-gram tokenizer breaks your input into blocks of characters of 
size n , which are then used to compare in the index. I think for Chinese , 
bi-gram is the favoured approach.
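
a rough (untested) sketch of a bigram field type, assuming a 1.3 build:

<fieldType name="text_bigram" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.NGramTokenizerFactory" minGramSize="2" maxGramSize="2"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>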

good luck,
B
_
{Beto|Norberto|Numard} Meijome

I used to hate weddings; all the Grandmas would poke me and
say, You're next sonny! They stopped doing that when i
started to do it to them at funerals.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: .wsdl for example....

2008-08-18 Thread Norberto Meijome
On Mon, 18 Aug 2008 19:08:24 -0300
Alexander Ramos Jardim [EMAIL PROTECTED] wrote:

 Do you wanna a full web service for SOLR example? How a .wsdl will help you?
 Why don't you use the HTTP interface SOLR provides?
 
 Anyways, if you need to develop a web service (SOAP compliant) to access
 SOLR, just remember to use an embedded core on your webservice.

On Mon, 18 Aug 2008 15:37:24 -0400
Erik Hatcher [EMAIL PROTECTED] wrote:

 WSDL?   surely you jest.
 
   Erik

:D I obviously said something terribly stupid; oh well, not the first time and
most likely won't be the last one either.

Anyway, the reason for my asking is : 
 - I've put together a SOLR search service with a few cores. Nothing fancy, it 
works great as is.
 -  the .NET developer I am working with on this  asked for a .wsdl (or .asmx) 
file to import into Visual Studio ... yes, he can access the service directly, 
but he seems to prefer a more 'well defined' interface (haven't really decided 
whether it is worth the effort, but that is another question altogether)

The way I see it, SOLR is a  RESTful service. I am not looking into wrapping 
the whole thing behind SOAP ( I actually much prefer REST than SOAP, but that 
is entering into quasi-religious grounds...) - which should be able to be 
defined with a .wsdl ( v 1.1 should suffice as only GET + POST are supported in 
SOLR anyway).

Am I missing anything here ?

thanks in advance for your time + thoughts ,
B
_
{Beto|Norberto|Numard} Meijome

He has no enemies, but is intensely disliked by his friends.
  Oscar Wilde

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: .wsdl for example....

2008-08-18 Thread Norberto Meijome
On Tue, 19 Aug 2008 11:23:48 +1000
Norberto Meijome [EMAIL PROTECTED] wrote:

 On Mon, 18 Aug 2008 19:08:24 -0300
 Alexander Ramos Jardim [EMAIL PROTECTED] wrote:
 
  Do you wanna a full web service for SOLR example? How a .wsdl will help you?
  Why don't you use the HTTP interface SOLR provides?
  
  Anyways, if you need to develop a web service (SOAP compliant) to access
  SOLR, just remember to use an embedded core on your webservice.
 
 On Mon, 18 Aug 2008 15:37:24 -0400
 Erik Hatcher [EMAIL PROTECTED] wrote:
 
  WSDL?   surely you jest.
  
  Erik
 
 :D I obviously said something terribly stupid, oh well, not the first time 
 and most likely won't be the last one either.
 
 Anyway, the reason for my asking is : 
  - I've put together a SOLR search service with a few cores. Nothing fancy, 
 it works great as is.
  -  the .NET developer I am working with on this  asked for a .wsdl (or 
 .asmx) file to import into Visual Studio ... yes, he can access the service 
 directly, but he seems to prefer a more 'well defined' interface (haven't 
 really decided whether it is worth the effort, but that is another question 
 altogether)
 
 The way I see it, SOLR is a  RESTful service. I am not looking into wrapping 
 the whole thing behind SOAP ( I actually much prefer REST than SOAP, but that 
 is entering into quasi-religious grounds...) - which should be able to be 
 defined with a .wsdl ( v 1.1 should suffice as only GET + POST are supported 
 in SOLR anyway).
 
 Am I missing anything here ?
 
 thanks in advance for your time + thoughts ,
 B

To be clear, I don't suggest we should have a .wsdl for the example - I am
simply asking if there would be any use in having one.

but given the responses I got, I'm curious now to understand what I have gotten 
wrong :)

Best,
B
_
{Beto|Norberto|Numard} Meijome

 I sense much NT in you.
 NT leads to Bluescreen.
 Bluescreen leads to downtime.
 Downtime leads to suffering.
 NT is the path to the darkside.
 Powerful Unix is.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Clarification on facets

2008-08-18 Thread Norberto Meijome
On Tue, 19 Aug 2008 10:18:12 +1200
Gene Campbell [EMAIL PROTECTED] wrote:

 Is this interpreted as meaning, there are 10 documents that will match
 with 'car' in the title, and likewise 6 'boat' and 2 'bike'?

Correct.

 If so, is there any way to get counts for the *number times* a value
 is found in a document.  I'm looking for a way to determine the number
 of times 'car' is repeated in the title, for example

Not sure - I would suggest that a field with a term repeated several times
would receive a higher score when searching for that term, but I am not sure
how you could get the information you seek... maybe with the Luke handler?
(but on a per-document basis... slow?)

B
_
{Beto|Norberto|Numard} Meijome

Computers are like air conditioners; they can't do their job properly if you 
open windows.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


[SOLVED...]Re: Problems using saxon for XSLT transforms

2008-08-17 Thread Norberto Meijome
On Tue, 12 Aug 2008 23:36:32 +1000
Norberto Meijome [EMAIL PROTECTED] wrote:

 hi :)
 I'm trying to use SAXON instead of the default XSLT parser. I was pretty sure 
 i
 had it running fine on 1.2, but when I repeated the same steps (as per the
 wiki) on latest nightly build, i cannot see any sign of it being loaded or 
 use,
 although the classpath seems to be pointing to them (see below)
 
[...]

well, although there is no explicit information about whether it IS using
saxon, it obviously dies when saxon isn't present - I moved lib/saxon* out of
the way, and any transformation fails with:


HTTP ERROR: 500

Provider net.sf.saxon.TransformerFactoryImpl not found

javax.xml.transform.TransformerFactoryConfigurationError: Provider 
net.sf.saxon.TransformerFactoryImpl not found
at 
javax.xml.transform.TransformerFactory.newInstance(TransformerFactory.java:108)
at 
org.apache.solr.util.xslt.TransformerProvider.init(TransformerProvider.java:45)
at 
org.apache.solr.util.xslt.TransformerProvider.clinit(TransformerProvider.java:43)
at 
org.apache.solr.request.XSLTResponseWriter.getTransformer(XSLTResponseWriter.java:117)
at 
org.apache.solr.request.XSLTResponseWriter.getContentType(XSLTResponseWriter.java:65)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:250)
at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1088)
at 
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:360)
at 
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at 
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at 
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:729)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
at 
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:206)
at 
org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at 
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:324)
at 
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:505)
at 
org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:829)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:211)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:380)
at 
org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
at 
org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:450)

RequestURI=/solr/tracks/select/


I guess not as clear as what I'd had hoped for, but should do for now :)

cheers,
B
_
{Beto|Norberto|Numard} Meijome

Computers are like air conditioners; they can't do their job properly if you 
open windows.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


DataImportHandler : more forgiving initialisation possible?

2008-08-17 Thread Norberto Meijome
hi guys,
First of all, thanks for DIH - it's great :)

One thing I noticed during my tests (nightly, 2008-08-16) is that, if the DB
is not available at SOLR startup time, the whole core won't initialise - the
error is shown below.

I was wondering,
1) would it be possible to have DIH bomb out in this situation, but not bring 
down the whole core from running?  I think it would be desirable , with a big 
warning , possibly... thoughts ?

2) How hard would it be to handle this more gracefully - for example, in case
of error, leave the handler in a non-initialised state, and when it is
accessed, repeat the whole init process (and bomb out if it fails again, of
course)...

Thanks for your time on this email + DIH + all other features :)
B

[...]
Aug 17, 2008 11:25:48 PM org.apache.solr.handler.dataimport.DataImportHandler 
processConfiguration
INFO: Processing configuration from solrconfig.xml: {config=data-config.xml}
Aug 17, 2008 11:25:48 PM org.apache.solr.handler.dataimport.DataImporter 
loadDataConfig
INFO: Data Configuration loaded successfully
Aug 17, 2008 11:25:48 PM org.apache.solr.handler.dataimport.JdbcDataSource$1 
call
INFO: Creating a connection for entity an_artist with URL: 
jdbc:sqlserver://a.b.c.d:1433;databaseName=DBNAME;user=usrname;password=magicpassword;responseBuffering=adaptive;
Aug 17, 2008 11:25:48 PM org.apache.solr.handler.dataimport.DataImportHandler 
inform
SEVERE: Exception while loading DataImporter
org.apache.solr.handler.dataimport.DataImportHandlerException: Failed to 
initialize DataSource: null Processing Documemt # 
at 
org.apache.solr.handler.dataimport.DataImporter.getDataSourceInstance(DataImporter.java:306)
at 
org.apache.solr.handler.dataimport.DataImporter.addDataSource(DataImporter.java:273)
at 
org.apache.solr.handler.dataimport.DataImporter.initEntity(DataImporter.java:228)
at 
org.apache.solr.handler.dataimport.DataImporter.init(DataImporter.java:98)
at 
org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:106)
at 
org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:294)
at org.apache.solr.core.SolrCore.init(SolrCore.java:473)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:295)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207)
at 
org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:107)
at 
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:97)
at 
org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:39)
at 
org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:593)
at org.mortbay.jetty.servlet.Context.startContext(Context.java:140)
at 
org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1220)
at 
org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:513)
at 
org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:448)
at 
org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:39)
at 
org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
at 
org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
at 
org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:39)
at 
org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
at 
org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:39)
at 
org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
at org.mortbay.jetty.Server.doStart(Server.java:222)
at 
org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:39)
at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:977)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.mortbay.start.Main.invokeMain(Main.java:183)
at org.mortbay.start.Main.start(Main.java:497)
at org.mortbay.start.Main.main(Main.java:115)
Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: 
Unable to create database connection Processing Documemt # 
at 
org.apache.solr.handler.dataimport.JdbcDataSource.init(JdbcDataSource.java:67)
at 
org.apache.solr.handler.dataimport.DataImporter.getDataSourceInstance(DataImporter.java:303)
... 34 more
Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: The TCP/IP 
connection to the host  has failed. java.net.ConnectException: Connection 
refused
at 

DIH - calling spellchecker rebuild...

2008-08-17 Thread Norberto Meijome
Guys + gals,

just a question of form - would DIH itself be the right place to implement a
URL to call after successfully completing a DIH full or partial load - for
example, to rebuild the spellchecker when new items have been added?  Or should
that be part of my external process (a cron-driven shell script, for example)
that calls DIH in the first place?

cheers
B
_
{Beto|Norberto|Numard} Meijome

If you find a solution and become attached to it, the solution may become your 
next problem.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: DIH - calling spellchecker rebuild...

2008-08-17 Thread Norberto Meijome
On Sun, 17 Aug 2008 20:22:26 +0530
Shalin Shekhar Mangar [EMAIL PROTECTED] wrote:

 If it is only SpellCheckComponent that you are interested in, then see
 SOLR-622.
 
 You can add this to your SCC config to rebuild SCC after every commit:
 <str name="buildOnCommit">true</str>

ah great stuff, thanks Shalin.
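
for anyone else following along, that ends up looking roughly like this in
solrconfig.xml (a sketch only - the spellchecker name and source field are
placeholders):

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">spell</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>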
B

_
{Beto|Norberto|Numard} Meijome

Truth has no special time of its own.  Its hour is now -- always.
   Albert Schweitzer

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


DIH - commit / optimize

2008-08-17 Thread Norberto Meijome
Hi again,

I see in the DIH wiki page :
[...]
full-import [..] 
commit: (default 'true'). Tells whether to commit+optimize after the operation 
[...]

but nothing for delta-import... I think it would be useful to have a 'commit'
(default=true) and an 'optimize' (default=false) for delta-import - and these
should most probably be separate options.

- for full-import , wouldn't it make sense to split commit + optimize into 2 
different options? Granted, if I do a clean=true,  i'd probably want (need!) an 
optimize... even then, optimize may be too slow / use too much memory at that 
point in time... ? ( not too sure about this argument..)

cheers,
B
_
{Beto|Norberto|Numard} Meijome

Never take Life too seriously, no one gets out alive anyway.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: DIH - commit / optimize

2008-08-17 Thread Norberto Meijome
On Mon, 18 Aug 2008 10:14:32 +0800
finy finy [EMAIL PROTECTED] wrote:

 i use solr for 3 months, and i find some question follow:

Please do not hijack mail threads.

http://en.wikipedia.org/wiki/Thread_hijacking

_
{Beto|Norberto|Numard} Meijome

Ask not what's inside your head, but what your head's inside of.
   J. J. Gibson

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: DIH - commit / optimize

2008-08-17 Thread Norberto Meijome
On Mon, 18 Aug 2008 09:34:56 +0530
Shalin Shekhar Mangar [EMAIL PROTECTED] wrote:

 Actually we have commit and optimize as separate request parameters
 defaulting to true for both full-import and delta-import. You can add a
 request parameter optimize=false for delta-import if you want to commit but
 not to optimize the index.

ah, now it makes perfect sense :) sorry, I should have checked the src myself.
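
(so, e.g., something like
http://host:8983/solr/dataimport?command=delta-import&optimize=false
- assuming the handler is registered at /dataimport.)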

thanks so much again :)
B

_
{Beto|Norberto|Numard} Meijome

What you are afraid to do is a clear indicator of the next thing you need to do.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Best way to index without diacritics

2008-08-14 Thread Norberto Meijome
( 2 in 1 reply) 
On Wed, 13 Aug 2008 09:59:21 -0700
Walter Underwood [EMAIL PROTECTED] wrote:

 Stripping accents doesn't quite work. The correct translation
 is language-dependent. In German, o-dieresis should turn into
 oe, but in English, it should be o (as in coöperate or
 Mötley Crüe). In Swedish, it should not be converted at all.

Hi Walter,
understood. This goes back to the question of language-specific field
definitions / parsers... more on this below.

 
 There are other character-to-string conversions: ae-ligature
 to ae, ß to ss, and so on. Luckily, those are independent
 of language.
 
 wunder
 
 On 8/13/08 9:16 AM, Steven A Rowe [EMAIL PROTECTED] wrote:
 
  Hi Norberto,
  
  https://issues.apache.org/jira/browse/LUCENE-1343

hi Steve,
thanks for the pointer. this is a Lucene entry... I thought the Latin-filter
was a SOLR feature? I, for one, definitely meant a SOLR filter. 

Given what Walter rightly pointed out about differences in language, I suspect
it would be a SOLR-level thing - a <fieldType name="textDE" language="DE"> would
apply the filter of unicode chars to {ascii?} with the appropriate mapping for
German, etc. 

Or is this that Lucene would / should take care of ?

B
_
{Beto|Norberto|Numard} Meijome

I've dirtied my hands writing poetry, for the sake of seduction; that is,  for
the sake of a useful cause. Dostoevsky

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: Best way to index without diacritics

2008-08-14 Thread Norberto Meijome
On Thu, 14 Aug 2008 11:34:47 -0400
Steven A Rowe [EMAIL PROTECTED] wrote:

[...]
 The kind of filter Walter is talking about - a generalized language-aware 
 character normalization Solr/Lucene filter - does not yet exist.  My guess is 
 that if/when it does materialize, both the Solr and the Lucene projects will 
 want to have it.  Historically, most functionality shared by Solr and Lucene 
 is eventually hosted by Lucene, since Solr has a Lucene dependency, but not 
 vice-versa.
 
 So, yes, Solr would be responsible for hosting configuration for such a 
 filter, but the responsibility for doing something with the configuration 
 would be Lucene's responsibility, assuming that Lucene would (eventually) 
 host the filter and Solr would host a factory over the filter.
 
 Steve

thanks for the thorough explanation ,Steve .
B

_
{Beto|Norberto|Numard} Meijome

Throughout the centuries there were [people] who took first steps down new 
paths armed only with their own vision.
   Ayn Rand

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Searching Questions

2008-08-13 Thread Norberto Meijome
On Tue, 12 Aug 2008 13:26:26 -0700
Jake Conk [EMAIL PROTECTED] wrote:

 1) I want to search only within a specific field, for instance
 `category`. Is there a way to do this?

of course. Please see http://wiki.apache.org/solr/SolrQuerySyntax (in 
particular, follow the link to Lucene syntax..)
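
e.g. (made-up field/value):

http://localhost:8983/solr/select?q=category:electronics

restricts matching to the category field only.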

 
 2) When searching for multiple results are the following identical
 since *_facet and *_facet_mv have their type's both set to string?
 
 /select?q=tag_facet:%22John+McCain%22+OR+tag_facet:%22Barack+Obama%22
 /select?q=tag_facet_mv:%22John+McCain%22+OR+tag_facet_mv:%22Barack+Obama%22

Erik H. already answered this question , in another of your emails. Check your 
mailbox or the lists archives.

 3) If I'm searching for something that is in a text field but I
 specify it as a facet string rather than a text type would it still
 search within text fields or would it just limit the search to string
 fields?

I am not sure what you mean by 'a facet string'. You facet on fields; SOLR
automatically creates facet counts on those fields based on the results of
your query.

 4) Is there a page that will show me different querying combinations
 or can someone post some more examples?

Have you checked the wiki? Which page do you suggest needs more examples?


 5) Anyone else notice returning back the data in php (wt=phps)
 doesn't unserialize? I am using PHP 5.3 w/ a nightly copy of Solr from
 last week.

sorry, haven't used PHP + SOLR

cheers,
B
_
{Beto|Norberto|Numard} Meijome

All that is necessary for the triumph of evil is that good men do nothing.
  Edmund Burke

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Problems using saxon for XSLT transforms

2008-08-12 Thread Norberto Meijome
hi :)
I'm trying to use SAXON instead of the default XSLT parser. I was pretty sure I
had it running fine on 1.2, but when I repeated the same steps (as per the
wiki) on the latest nightly build, I cannot see any sign of it being loaded or
used, although the classpath seems to be pointing to the jars (see below).

In my logs,i see :
INFO: created xslt: org.apache.solr.request.XSLTResponseWriter
Aug 12, 2008 11:20:07 PM org.apache.solr.request.XSLTResponseWriter init
INFO: xsltCacheLifetimeSeconds=5

which is the RH itself, then, on a hit that triggers the transform : 
Aug 12, 2008 11:21:25 PM org.apache.solr.util.xslt.TransformerProvider init
WARNING: The TransformerProvider's simplistic XSLT caching mechanism is not
appropriate for high load scenarios, unless a single XSLT transform is used and
xsltCacheLifetimeSeconds is set to a sufficiently high value.

This is where I would expect to see saxon...right?

I'm running SOLR 1.3, nightly from 2008-08-11, under FreeBSD 7 (stable), JDK
1.6.. I have 4 cores defined in this test environment. 

I start my service  with :

java -Xms64m -Xmx1024m -server
-Djavax.xml.transform.TransformerFactory=net.sf.saxon.TransformerFactoryImpl
-jar start.jar


the  /admin/get-properties.jsp shows

[]

javax.xml.transform.TransformerFactory = net.sf.saxon.TransformerFactoryImpl
java.specification.version = 1.6
[...]
java.class.path
= 
/solrhome:/solrhome/lib/saxon9-s9api.jar:/solrhome/lib/jetty-6.1.11.jar:/solrhome/lib/saxon9-jdom.jar:/solrhome/lib/saxon9-sql.jar:/solrhome/lib/servlet-api-2.5-6.1.11.jar:/solrhome/lib/saxon9-xqj.jar:/solrhome/lib/saxon9.jar:/solrhome/lib/jetty-util-6.1.11.jar:/solrhome/lib/saxon9-xom.jar:/solrhome/lib/saxon9-dom4j.jar:/solrhome/lib/saxon9-xpath.jar:/solrhome/lib/saxon9-dom.jar:/solrhome/lib/jsp-2.1/core-3.1.1.jar:/solrhome/lib/jsp-2.1/ant-1.6.5.jar:/solrhome/lib/jsp-2.1/jsp-2.1.jar:/solrhome/lib/jsp-2.1/jsp-api-2.1.jar:/solrhome/lib/management/jetty-management-6.1.11.jar:/solrhome/lib/naming/jetty-naming-6.1.11.jar:/solrhome/lib/naming/activation-1.1.jar:/solrhome/lib/naming/mail-1.4.jar:/solrhome/lib/plus/jetty-plus-6.1.11.jar:/solrhome/lib/xbean/jetty-xbean-6.1.11.jar:/solrhome/lib/annotations/geronimo-annotation_1.0_spec-1.0.jar:/solrhome/lib/annotations/jetty-annotations-6.1.11.jar:/solrhome/lib/ext/jetty-java5-threadpool-6.1.11.jar:/solrhome/lib/ext/jetty-sslengine-6.1.11.jar:/solrhome/lib/ext/jetty-servlet-tester-6.1.11.jar:/solrhome/lib/ext/jetty-ajp-6.1.11.jar:/solrhome/lib/ext/jetty-setuid-6.1.11.jar:/solrhome/lib/ext/jetty-client-6.1.11.jar:/solrhome/lib/ext/jetty-html-6.1.11.jar

[...]

Any pointers to where I should check to confirm saxon is being used, or
to address the problem will be greatly appreciated.

TIA,
B
_
{Beto|Norberto|Numard} Meijome

Nature doesn't care how smart you are. You can still be wrong.
  Richard Feynman

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: adds / delete within same 'transaction'..

2008-08-12 Thread Norberto Meijome
On Tue, 12 Aug 2008 11:21:50 -0700
Mike Klaas [EMAIL PROTECTED] wrote:

  will delete happen first, and then the add, or could it be that the  
  add happens before delete, in which case i end up with no more doc  
  id=1 ?  
 
 As long as you are sending these requests on the same thread, they  
 will occur in order.
 
 -Mike

right, that is GREAT to know then :)

cheers,
b

_
{Beto|Norberto|Numard} Meijome

Life is not measured by the number of breaths we take, but by the moments that 
take our breath away.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: adds / delete within same 'transaction'..

2008-08-12 Thread Norberto Meijome
On Tue, 12 Aug 2008 20:53:12 -0400
Yonik Seeley [EMAIL PROTECTED] wrote:

 On Tue, Aug 12, 2008 at 1:48 AM, Norberto Meijome [EMAIL PROTECTED] wrote:
  What happens if I issue:
   
 <delete><id>1</id></delete>
 <add><doc><id>1</id><name>new</name></doc></add>
 <commit/>
 
  will delete happen first, and then the add, or could it be that the add 
  happens before delete  
 
 Doesn't matter... it's an implementation detail.  Solr used to buffer
 deletes, and if it crashed at the right time one could get duplicates.
  Now, Lucene does the buffering of deletes (internally lucene does the
 adds first and buffers the deletes until a segment flush) and it
 should be impossible to see more than one 1 or no 1 at all.

Thanks Yonik. I wasn't asking about the specific details, but about the
consequence. I seem to remember (incorrectly, or maybe v1.2 only) that if
one wanted assurances that the case above happened in the right order, one had
to commit after the deletes, and once more after the adds.

This not being the case, I am happy :) 

Thanks again,
B
_
{Beto|Norberto|Numard} Meijome

He has Van Gogh's ear for music.
  Billy Wilder

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Best way to index without diacritics

2008-08-12 Thread Norberto Meijome
On Tue, 12 Aug 2008 11:44:42 -0400
Steven A Rowe [EMAIL PROTECTED] wrote:

 Solr is Unicode aware.  The ISOLatin1AccentFilterFactory handles diacritics 
 for the ISO Latin-1 section of the Unicode character set.  UTF (do you mean 
 UTF-8?) is a (set of) Unicode serialization(s), and once Solr has 
 deserialized it, it is just Unicode characters (Java's in-memory UTF-16 
 representation).
 
 So as long as you're only concerned about removing diacritics from the set of 
 Unicode characters that overlaps ISO Latin-1, and not about other Unicode 
 characters, then ISOLatin1AccentFilterFactory should work for you.

hi,
do you know if anyone has implemented a similar filter using ICU, mapping (a
lot more of) UTF-8 to ASCII?
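
(for context, the Latin-1 filter is just a one-liner in the analyzer chain:

<filter class="solr.ISOLatin1AccentFilterFactory"/>

what I'm wondering about is an ICU-backed equivalent covering much more of
Unicode.)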

B

_
{Beto|Norberto|Numard} Meijome

He has the attention span of a lightning bolt.
  Robert Redford

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Still no results after removing from stopwords

2008-08-11 Thread Norberto Meijome
On Sun, 10 Aug 2008 19:58:24 -0700 (PDT)
SoupErman [EMAIL PROTECTED] wrote:

 I needed to run a search with a query containing the word not, so I removed
 not from the stopwords.txt file. Which seemed to work, at least as far as
 parsing the query. It was now successfully searching for that keyword, as
 noted in the query debugger. However it isn't returning any results where
 not is in the query, which suggests not hasn't been indexed. However
 looking at the listing for a particular item, not is listed as one of the
 keywords, so it should be finding it?

Hi Michael,
did you reindex your documents after 1) changing your settings and 2) 
restarting SOLR (to allow your settings to come into effect)?
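
(the stop filter in question is typically declared on both the index and query
analyzers, something like

<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>

which is exactly why the index-time side only takes effect after a reindex.)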

B

_
{Beto|Norberto|Numard} Meijome

Real Programmers don't comment their code. If it was hard to write, it should 
be hard to understand and even harder to modify.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: unique key

2008-08-11 Thread Norberto Meijome
On Wed, 6 Aug 2008 12:25:34 +1000
Norberto Meijome [EMAIL PROTECTED] wrote:

 On Tue, 5 Aug 2008 14:41:08 -0300
 Scott Swan [EMAIL PROTECTED] wrote:
 
  I currently have multiple documents that i would like to index but i would 
  like to combine two fields to produce the unique key.
  
  the documents either have 1 or the other fields so by combining the two 
  fields i will get a unique result.
  
  is this possible in the solr schema? 

 
 Hi Scott,
 you can't do that by the schema - you need to do it when you generate your 
 document, before posting it to SOLR.

Hi again,
after reading the DataImportHandler documentation, you could do this too with 
specific configuration in DIH itself. Of course, you have to be using DIH to 
load data into your SOLR ;)
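
e.g., roughly (a sketch - entity/column names are made up):

<entity name="doc" transformer="TemplateTransformer"
        query="select field_a, field_b from some_table">
  <field column="id" template="${doc.field_a}_${doc.field_b}"/>
</entity>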

B

_
{Beto|Norberto|Numard} Meijome

Intellectual: 'Someone who has been educated beyond his/her intelligence'
   Arthur C. Clarke, from 3001, The Final Odyssey, Sources.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Can't Delete Record

2008-08-11 Thread Norberto Meijome
On Mon, 11 Aug 2008 06:48:05 -0700 (PDT)
Vj Ali [EMAIL PROTECTED] wrote:

  i also sends coomit tag as well.

maybe you need 

<commit/>

instead of coomit
?


_
{Beto|Norberto|Numard} Meijome

With sufficient thrust, pigs fly just fine. However, this is not necessarily a 
good idea. It is hard to be sure where they are going to land, and it could be 
dangerous sitting under them as they fly overhead.
   [RFC1925 - section 2, subsection 3]

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


adds / delete within same 'transaction'..

2008-08-11 Thread Norberto Meijome
Hello :)

I *think* i know the answer, but i'd like to confirm :

Say I have 
<doc><id>1</id><name>old</name></doc>

already indexed and commited (ie, 'live' ) 

What happens if I issue:

<delete><id>1</id></delete>
<add><doc><id>1</id><name>new</name></doc></add>
<commit/>

will delete happen first, and then the add, or could it be that the add happens 
before delete, in which case i end up with no more doc id=1 ? 

thanks!!
B
_
{Beto|Norberto|Numard} Meijome

Anyone who isn't confused here doesn't really understand what's going on.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: case preserving for data but not for indexing

2008-08-07 Thread Norberto Meijome
On Wed, 6 Aug 2008 21:35:47 -0700 (PDT)
Otis Gospodnetic [EMAIL PROTECTED] wrote:

 <tokenizer class="solr.KeywordTokenizerFactory"/>
 <filter class="solr.StandardTokenizerFactory"/>
 
 
 2 Tokenizers?

i wondered about that too, but didn't have the time to test...
B

_
{Beto|Norberto|Numard} Meijome

Always listen to experts.  They'll tell you what can't be done, and why.  
Then do it.
  Robert A. Heinlein

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: HTML Standard Strip filter word boundary bug

2008-08-07 Thread Norberto Meijome
On Thu, 7 Aug 2008 00:50:59 -0700 (PDT)
matt connolly [EMAIL PROTECTED] wrote:

 Where do I file a bug report?

https://issues.apache.org/jira

thanks!
B

_
{Beto|Norberto|Numard} Meijome

Contrary to popular belief, Unix is user friendly. It just happens to be very 
selective about who it decides to make friends with.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Solr Logo thought

2008-08-06 Thread Norberto Meijome
On Tue, 05 Aug 2008 16:02:51 -0400
Stephen Weiss [EMAIL PROTECTED] wrote:

 My issue with the logos presented was they made solr look like a  
 school project instead of the powerful tool that it is.  The tricked  
 out font or whatever just usually doesn't play well with the business  
 types... they want serious-looking software.  First impressions are  
 everything.  While the fiery colors are appropriate for something  
 named Solr, you can play with that without getting silly - take a look  
 at:

couldn't agree more. The current logo needs improvement, but I think it can be
done much better... in particular thinking of small icons, print, etc...

 http://www.ascsolar.com/images/asc_solar_splash_logo.gif
 http://www.logostick.com/images/EOS_InvestmentingLogo_lg.gif
 
 (Luckily there are many businesses that do solar energy!)
 
 They have the same elements but with a certain simplicity and elegance.
 
 I know probably some people don't care if it makes the boss or client  
 happy, but, these are the kinds of seemingly insignificant things that 

Indeed - the way I see it, if you don't care either way, then you should be
happy to have a professional-looking one :P

B
_
{Beto|Norberto|Numard} Meijome

Caminante no hay camino, se hace camino al andar
(Traveller, there is no road; the road is made by walking)
   Antonio Machado

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: Diagnostic tools

2008-08-05 Thread Norberto Meijome
On Tue, 5 Aug 2008 11:43:44 -0500
Kashyap, Raghu [EMAIL PROTECTED] wrote:

 Hi,

Hi Kashyap,
please don't hijack topic threads.

http://en.wikipedia.org/wiki/Thread_hijacking

thanks!!
B
_
{Beto|Norberto|Numard} Meijome

Software QA is like cleaning my cat's litter box: Sift out the big chunks. Stir 
in the rest. Hope it doesn't stink.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: unique key

2008-08-05 Thread Norberto Meijome
On Tue, 5 Aug 2008 14:41:08 -0300
Scott Swan [EMAIL PROTECTED] wrote:

 I currently have multiple documents that i would like to index but i would 
 like to combine two fields to produce the unique key.
 
 the documents either have 1 or the other fields so by combining the two 
 fields i will get a unique result.
 
 is this possible in the solr schema? 
 

Hi Scott,
you can't do that in the schema - you need to build the combined key yourself
when you generate your document, before posting it to SOLR.
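
A minimal SolrJ sketch of what I mean - the field names, the 'a:'/'b:'
prefixes and the URL are all made up for illustration:

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class AddWithCombinedKey {
    public static void main(String[] args) throws Exception {
        String fieldA = "ABC"; // hypothetical: each document carries
        String fieldB = null;  // one or the other of these two fields

        // Build the uniqueKey client-side; the prefix guarantees values
        // coming from the two different fields can never collide.
        String id = (fieldA != null && fieldA.length() > 0)
                ? "a:" + fieldA
                : "b:" + fieldB;

        SolrServer server =
                new CommonsHttpSolrServer("http://localhost:8983/solr");
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", id);
        if (fieldA != null) doc.addField("fieldA", fieldA);
        if (fieldB != null) doc.addField("fieldB", fieldB);
        server.add(doc);
        server.commit();
    }
}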

btw, please don't hijack topic threads.

http://en.wikipedia.org/wiki/Thread_hijacking

thanks!!
B
_
{Beto|Norberto|Numard} Meijome

Law of Conservation of Perversity: 
  we can't make something simpler without making something else more complex

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Sum of one field

2008-08-05 Thread Norberto Meijome
On Tue, 05 Aug 2008 18:58:42 -0300
Leonardo Dias [EMAIL PROTECTED] wrote:

 So I'm looking for a Ferrari. CarStore says that there are 5 ads for 
 Ferrari, but one ad has 2 Ferraris being sold, the other ad has 3 
 Ferraris and all the others have 1 Ferrari each, meaning that there are 
 5 ads and 8 Ferraris. And yes, I'm doing an example with Fibonacci 
 numbers. ;)

why not create one separate document per car? It'll also make it easier (for
the client) to manage when one of the cars is sold but not the other four.
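
e.g. something along these lines (field names made up), with an ad_id field
tying the per-car documents back to the ad they came from:

<add>
  <doc>
    <field name="id">ad42-car1</field>
    <field name="ad_id">42</field>
    <field name="make">Ferrari</field>
  </doc>
  <doc>
    <field name="id">ad42-car2</field>
    <field name="ad_id">42</field>
    <field name="make">Ferrari</field>
  </doc>
</add>

numFound for make:Ferrari then counts cars rather than ads, and you can still
facet on ad_id when you need the per-ad view.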

B
_
{Beto|Norberto|Numard} Meijome

With sufficient thrust, pigs fly just fine. However, this is not necessarily a 
good idea. It is hard to be sure where they are going to land, and it could be 
dangerous sitting under them as they fly overhead.
   [RFC1925 - section 2, subsection 3]

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Solr Logo thought

2008-08-04 Thread Norberto Meijome
On Mon, 4 Aug 2008 09:29:30 -0700
Ryan McKinley [EMAIL PROTECTED] wrote:

 
  If there is a still room for new log design for Solr and the  
  community is
  open for it then I can try to come up with some proposal. Doing logo  
  for
  Mahout was really interesting experience.
 
 
 In my opinion, yes  I'd love to see more effort put towards  the  
 logo.  I have stayed out of this discussion since I don't really think  
 any of the logos under consideration are complete.  (I begged some  
 friends to do two of the three logos under consideration)  I would  
 love to refine them, but time... oooh time.

+1 

If we are going to change what we have, I'd love to see some more options, or
better quality - no offence meant, but those logos aren't really a huge
improvement on, or departure from, the current one.

I think whatever we change to, we'll be wanting to use it for a long time.

B
_
{Beto|Norberto|Numard} Meijome

If you find a solution and become attached to it, the solution may become your
next problem.

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: solr 1.3 ??

2008-08-04 Thread Norberto Meijome
On Mon, 4 Aug 2008 21:13:09 -0700 (PDT)
Vicky_Dev [EMAIL PROTECTED] wrote:

 Can we get solr 1.3 release as soon as possible? Otherwise some interim
 release (1.2.x) containing DataImportHandler will also a good option. 
 
 Any Thoughts?


Have you tried one of the nightly builds? I've been following them every so
often... sometimes there is a problem, but hardly ever. You can find a build
you are comfortable with, and it'll be far closer to the actual 1.3, when it
is released, than 1.2 is.

B

_
{Beto|Norberto|Numard} Meijome

Quantum Logic Chicken:
  The chicken is distributed probabalistically on all sides of the
  road until you observe it on the side of your course.

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: performance implications on using lots of values in fq

2008-07-24 Thread Norberto Meijome
On Wed, 23 Jul 2008 11:28:49 -0700 (PDT)
briand [EMAIL PROTECTED] wrote:

 I have documents in SOLR such that each document contains one to many points
 (latitude and longitudes).   Currently we store the multiple points for a
 given document in the db and query the db to find all of the document ids
 around a given point first.   Once we have the list of ids, we populate the
 fq with those ids and the q value and send that off to SOLR to do a search.  
 In the longest query to SOLR we're populating about 450 ids into the fq
 parameter at this time.   I was wondering if anyone knows the performance
 implications of passing so many ids into the fq and when it would
 potentially be a problem for SOLR?   Currently the query passing in 450 ids
 is not a problem at all and returns in less than a second.   Thanks. 

Hey Brian,
sorry, I can't answer your question. But I wonder whether you have tried
PostgreSQL + PostGIS extensions, and what your experience has been compared to
Lucene/SOLR.

thanks :)
b

_
{Beto|Norberto|Numard} Meijome

Computers are like air conditioners; they can't do their job properly if you 
open windows.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Duplicate content

2008-07-15 Thread Norberto Meijome
On Tue, 15 Jul 2008 13:15:41 +0530
Sunil [EMAIL PROTECTED] wrote:

 1) I don't want duplicate content.

SOLR uses the field you define as the uniqueKey to determine whether a
document should be replaced or added. The rest of the fields are in your hands.
You could devise a setup whereby the document id is generated by hashing all
the other fields in your schema, thereby ensuring that a unique document id
means unique content (where 'unique' means byte-for-byte different, of
course ;) )
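
A rough sketch of that idea in plain Java (SHA-1 and the sample field values
are just illustrative choices):

import java.security.MessageDigest;

public class ContentHashId {
    // Derive the document id from the content itself, so identical
    // content always maps to the same id and a re-add of the same
    // bytes simply replaces the identical document.
    static String contentId(String... fieldValues) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        for (String v : fieldValues) {
            md.update(v.getBytes("UTF-8"));
            md.update((byte) 0); // separator, so "ab","c" != "a","bc"
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(contentId("some title", "some body text"));
    }
}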

 2) I don't want to overwrite old content with new one. 
 
 Means, if I add duplicate content in solr and the content already
 exists, the old content should not be overwritten.

Before inserting a new document, query the index - if you get a result back,
don't insert. I don't know of any other way.
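
A check-then-add sketch in SolrJ (uniqueKey field 'id', field names and URL
assumed). Note it is not atomic, and documents that are added but not yet
committed will not show up in the check:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class AddIfAbsent {
    public static void main(String[] args) throws Exception {
        SolrServer server =
                new CommonsHttpSolrServer("http://localhost:8983/solr");
        String id = "doc-1"; // hypothetical uniqueKey value

        // Query first, add only when nothing matched. Two concurrent
        // writers can still both pass this check - see the follow-up
        // below about the race condition.
        SolrQuery q = new SolrQuery("id:" + id);
        if (server.query(q).getResults().getNumFound() == 0) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", id);
            doc.addField("name", "some content"); // hypothetical field
            server.add(doc);
            server.commit();
        }
    }
}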

b
_
{Beto|Norberto|Numard} Meijome

The real voyage of discovery consists not in seeking new landscapes, but in
having new eyes. Marcel Proust

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: Duplicate content

2008-07-15 Thread Norberto Meijome
On Tue, 15 Jul 2008 10:48:14 +0200
Jarek Zgoda [EMAIL PROTECTED] wrote:

  2) I don't want to overwrite old content with new one. 
 
  Means, if I add duplicate content in solr and the content already
  exists, the old content should not be overwritten.  
  
  before inserting a new document, query the index - if you get a result back,
  then don't insert. I don't know of any other way.  
 
 This operation is not atomic, so you get a race condition here. Other
 than that, it seems fine. ;)

Of course - but I am not sure you can control atomicity at the SOLR level
(yet? ;) ) for the /update handler - so it'd have to be either a custom
handler, or your app being the only writer and serialising its own updates. It
definitely gets more interesting if you start adding shards ;)

_
{Beto|Norberto|Numard} Meijome

All parts should go together without forcing. You must remember that the parts
you are reassembling were disassembled by you. Therefore, if you can't get them
together again, there must be a reason. By all means, do not use hammer. IBM
maintenance manual, 1975

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: Filter by Type increases search results.

2008-07-15 Thread Norberto Meijome
On Tue, 15 Jul 2008 18:07:43 +0530
Preetam Rao [EMAIL PROTECTED] wrote:

 When I say filter, I meant q=fish&fq=type:idea

btw, this *seems* to work for me only with the standard search handler. dismax
and fq don't seem to get along nicely... but maybe it is just late and I'm not
testing it properly...
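
For reference, the kind of requests I was comparing - a sketch assuming the
stock 1.3 example config, which ships a handler named dismax:

http://localhost:8983/solr/select?q=fish&fq=type:idea
http://localhost:8983/solr/select?qt=dismax&q=fish&fq=type:idea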

_
{Beto|Norberto|Numard} Meijome

Mix a little foolishness with your serious plans;
it's lovely to be silly at the right moment.
   Horace

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Wiki for 1.3

2008-07-14 Thread Norberto Meijome
On Mon, 14 Jul 2008 15:52:35 +
sundar shankar [EMAIL PROTECTED] wrote:

 Hi Hoss,
  I was talking about classes like EdgeNGramFilterFactory, 
 PatternReplaceFilterFactory etc. I didnt find these in the 1.2 Jar. Where do I 
 find wiki for these and Specific classes introduced for 1.3?

Sundar, as explained in my email of 12 July, the wiki covers all classes. The
ones that are 1.3-specific will say so at the top of the page.

If you want to know what classes were introduced in 1.3, why not check out both 
trees and compare?

b

_
{Beto|Norberto|Numard} Meijome

Which is worse: ignorance or apathy?
Don't know. Don't care.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.

