RE: data-config.xml: delta-import unclear behaviour pre/postDeleteImportQuery with clean

2011-02-01 Thread Charton, Andre
Hi Manu,

In 1.4.1 it is invoked if postImportDeleteQuery is not null and clean is 
true; see the code:

...
String delQuery = e.allAttributes.get("preImportDeleteQuery");
  if (dataImporter.getStatus() == DataImporter.Status.RUNNING_DELTA_DUMP) {
    cleanByQuery(delQuery, fullCleanDone);
    doDelta();
    delQuery = e.allAttributes.get("postImportDeleteQuery");
    if (delQuery != null) {
      fullCleanDone.set(false);
      cleanByQuery(delQuery, fullCleanDone);
    }
  }
...


private void cleanByQuery(String delQuery, AtomicBoolean completeCleanDone) {
  delQuery = getVariableResolver().replaceTokens(delQuery);
  if (requestParameters.clean) {
    if (delQuery == null && !completeCleanDone.get()) {
      writer.doDeleteAll();
      completeCleanDone.set(true);
    } else if (delQuery != null) {
      writer.deleteByQuery(delQuery);
    }
  }
}
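Read as a decision table, the snippet above yields three outcomes. A standalone distillation in plain Java (no Solr classes; the class and method names below are invented for illustration, not Solr's):

```java
// Distils the clean/delete decision in cleanByQuery() above into a pure
// function so the three observed behaviours are easy to see.
// Illustration only: names are invented, not Solr code.
public class CleanDecision {
    static String decide(boolean clean, String delQuery, boolean fullCleanDone) {
        if (!clean) return "nothing deleted";  // clean=false: delete queries are skipped entirely
        if (delQuery == null && !fullCleanDone) return "whole index deleted"; // no delete query: full wipe
        if (delQuery != null) return "deleteByQuery only";  // scoped delete, no full wipe
        return "nothing deleted";  // full clean already done earlier in the run
    }

    public static void main(String[] args) {
        System.out.println(decide(true, null, false));            // whole index deleted
        System.out.println(decide(true, "deleted:true", false));  // deleteByQuery only
        System.out.println(decide(false, "deleted:true", false)); // nothing deleted
    }
}
```

This matches Manuel's observations below: with clean=true and no delete query the whole index is wiped, and with clean=false neither delete query runs at all.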

André



-Original Message-
From: manuel aldana [mailto:ald...@gmx.de] 
Sent: Montag, 31. Januar 2011 09:40
To: solr-user@lucene.apache.org
Subject: data-config.xml: delta-import unclear behaviour 
pre/postDeleteImportQuery with clean

I'm seeing some unclear behaviour when using clean together with 
pre/postImportDeleteQuery for delta-imports. The docs under 
http://wiki.apache.org/solr/DataImportHandler#Configuration_in_data-config.xml 
do not make this clear.

My observation is:
- preImportDeleteQuery is only executed if clean=true is set
- postImportDeleteQuery is only executed if clean=true is set
- if preImportDeleteQuery is omitted and clean=true, then the whole 
index is cleaned
=> a config with only postImportDeleteQuery won't work

Is above correct?

I don't need preImportDeleteQuery; only post is necessary. But to make 
post work, I am duplicating the post query as the pre query, so that 
clean=true doesn't delete the whole index. This feels more like a 
workaround than intended behaviour.
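Expressed in data-config.xml, that workaround looks roughly like this (entity, table, and column names below are invented for illustration):

```xml
<entity name="item"
        query="SELECT id, title FROM item"
        deltaQuery="SELECT id FROM item WHERE updated > '${dataimporter.last_index_time}'"
        deltaImportQuery="SELECT id, title FROM item WHERE id = '${dataimporter.delta.id}'"
        preImportDeleteQuery="deleted:true"
        postImportDeleteQuery="deleted:true">
    <field column="id" name="id"/>
    <field column="title" name="title"/>
</entity>
```

The pre query here exists only to stop clean=true from issuing a delete-all; the post query does the actually intended cleanup.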

Solr version is 1.4.1.

thanks.

-- 
  manuel aldana
  mail: ald...@gmx.de | man...@aldana-online.de
  blog: www.aldana-online.de



Re: SolrJ (Trunk) Invalid version or the data in not in 'javabin' format

2011-02-01 Thread Em

Hi,

sorry for the late feedback. Everything seems to be fine now.

Thank you!


Koji Sekiguchi wrote:
 
 (11/01/31 3:11), Em wrote:

 Hello list,

 I build an application that uses SolrJ to communicate with Solr.

 What did I do?
 Well, I deleted all the solrj-lib stuff from my application's
 Webcontent-directory and inserted the solrj-lib from the freshly compiled
 solr 4.0 - trunk.
 However, when trying to query Solr 4.0 it shows me a
 RuntimeException:
"Invalid version or the data in not in 'javabin' format"
 
 I've just committed a small change so that you can see the version
 difference
 (I'll open the JIRA issue later because it is in maintenance now):
 
 Index: solr/src/common/org/apache/solr/common/util/JavaBinCodec.java
 ===
 --- solr/src/common/org/apache/solr/common/util/JavaBinCodec.java
 (revision 1065245)
 +++ solr/src/common/org/apache/solr/common/util/JavaBinCodec.java (working
 copy)
 @@ -96,7 +96,8 @@
   FastInputStream dis = FastInputStream.wrap(is);
   version = dis.readByte();
   if (version != VERSION) {
 -  throw new RuntimeException("Invalid version or the data in not in
 'javabin' format");
 +  throw new RuntimeException("Invalid version (expected " + VERSION +
 +  ", but " + version + ") or the data in not in 'javabin'
 format");
   }
   return readVal(dis);
 }
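The effect of that patch can be mimicked in isolation. A toy version-byte check (the class, method, and constant names below are invented; the real constant lives in JavaBinCodec and the real code reads from a FastInputStream):

```java
// Toy reproduction of the version check patched above: inspect the first
// byte of a payload and report both expected and actual version on
// mismatch. Names are invented for illustration, not Solr's.
public class JavabinVersionCheck {
    static final byte VERSION = 2; // assumed javabin version constant

    static byte checkVersion(byte[] payload) {
        byte version = payload[0]; // a javabin payload starts with a version byte
        if (version != VERSION) {
            throw new RuntimeException("Invalid version (expected " + VERSION
                + ", but " + version + ") or the data in not in 'javabin' format");
        }
        return version;
    }
}
```

With the patch, a SolrJ/Solr mismatch shows up as "expected 2, but 1" (for example) instead of the bare message, which is exactly what makes the trunk incompatibility above diagnosable.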
 
 Can you try the latest trunk and see the version difference?
 
 Koji
 -- 
 http://www.rondhuit.com/en/
 
 

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrJ-Trunk-Invalid-version-or-the-data-in-not-in-javabin-format-tp2384421p2396195.html
Sent from the Solr - User mailing list archive at Nabble.com.


SOLR 1.4 and Lucene 3.0.3 index problem

2011-02-01 Thread Churchill Nanje Mambe
hi guys
 I have developed a Java crawler and integrated the Lucene 3.0.3 API into it,
so it creates a Lucene index.
 Now I wish to search this Lucene index using Solr. I tried to configure
solrconfig.xml and schema.xml, and everything seems to be fine,
but then Solr told me the index is corrupt. With Luke, however, I am able to
browse the index and perform searches and other things on it.
 Can someone tell me which Solr version can wrap around a Lucene 3.0.3 index?
regards

Mambe Churchill Nanje
237 33011349,
AfroVisioN Founder, President,CEO
http://www.afrovisiongroup.com | http://mambenanje.blogspot.com
skypeID: mambenanje
www.twitter.com/mambenanje


Lucene 3.0.3 index cannot be read by Solr

2011-02-01 Thread Churchill Nanje Mambe
Hello, I need help.
 I am trying to configure Solr 1.4 to read a Lucene 3.0.3 based index
I have, but it says they are not compatible. Can someone help me, as I
don't know what to do?

-- 
Mambe Churchill Nanje
237 33011349,
AfroVisioN Founder, President,CEO
http://www.afrovisiongroup.com | http://mambenanje.blogspot.com
skypeID: mambenanje
www.twitter.com/mambenanje


Re: Solr for noSQL

2011-02-01 Thread Steven Noels
On Tue, Feb 1, 2011 at 11:52 AM, Upayavira u...@odoko.co.uk wrote:



 Apologies if my "nothing funky" sounded like you weren't doing cool
 stuff.


No offense whatsoever. I think my longer reply paints a more accurate picture
of what Lily means in terms of "SOLR for NoSQL", and it was your reaction
that triggered this additional explanation.


 I was merely attempting to say that I very much doubt you were
 doing anything funky like putting HBase underneath Solr as a replacement
 of FSDirectory.


There are some initiatives in the context of Cassandra IIRC, as well as a
project which stores Lucene index files in HBase tables, but frankly they
seem more like experiments, and I also think the nature of how Lucene/SOLR
works + what HBase does on top of Hadoop FS is somehow in conflict with each
other. Too many layers of indirection will kill performance on every layer.



 I was trying to imply that, likely your integration with
 Solr was relatively conventional (interacting with its REST interface),



Yep. We figured that was the wiser road to walk: it leaves a clearly-defined
interface and a possible area of improvement, versus a too-low-level
integration.


 and the funky stuff that you are doing sits outside of that space.

 Hope that's a clearer (and more accurate?) attempt at what I was trying
 to say.

 Upayavira (who finds the Lily project interesting, and would love to
 find the time to play with it)


Anytime, Upayavira. Anytime! ;-)

Steven.
-- 
Steven Noels
http://outerthought.org/
Scalable Smart Data
Makers of Kauri, Daisy CMS and Lily


Re: changing schema

2011-02-01 Thread Erick Erickson
That sounds right. You can cheat and just remove solr_home/data/index
rather than delete *:*, though (you should probably do that with the Solr
instance stopped).

Make sure to remove the index directory itself as well.

Best
Erick

On Tue, Feb 1, 2011 at 1:27 AM, Dennis Gearon gear...@sbcglobal.net wrote:

 Anyone got a great little script for changing a schema?

 i.e., after changing:
  database,
  the view in the database for data import
  the data-config.xml file
  the schema.xml file

 I BELIEVE that I have to run:
  a delete command for the whole index *:*
  a full import and optimize

 This all sound right?

  Dennis Gearon


 Signature Warning
 
 It is always a good idea to learn from your own mistakes. It is usually a
 better
 idea to learn from others’ mistakes, so you do not have to make them
 yourself.
from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'


 EARTH has a Right To Life,
 otherwise we all die.




Re: Terms and termscomponent questions

2011-02-01 Thread Erick Erickson
Nope, this isn't what I'd expect. There are a couple of possibilities:
1> check out what WordDelimiterFilterFactory is doing, although
 if you're really sending spaces, that's probably not it.
2> Let's see the field and fieldType definitions for the field
 in question. type="text" doesn't say anything about analysis,
 and that's where I'd expect you're having trouble - in particular,
 if your analysis chain uses KeywordTokenizerFactory, for instance.
3> Look at the admin/schema browser page, look at your field and
 see what the actual tokens are. That'll tell you what TermsComponent
 is returning; perhaps the concatenation is happening somewhere
 else.

Bottom line: Solr will not concatenate terms like this unless you tell it
to, so I suspect you're telling it to, you just don't realize it <g>...

Best
Erick

On Tue, Feb 1, 2011 at 1:33 AM, openvictor Open openvic...@gmail.comwrote:

 Dear Solr users,

 I am currently using SolR and TermsComponents to make an auto suggest for
 my
 website.

 I have a field called p_field, indexed and stored, with type="text" in the
 schema.xml. Nothing out of the usual.
 I feed Solr a set of words separated by a comma and a space, such as (for
 two documents):

 Document 1:
 word11, word12, word13. word14

 Document 2:
 word21, word22, word23. word24


 When I use my newly designed field, I get these things for the prefix word1:
 word11, word12, word13. word14, word11word12, word11word13, etc...
 Is it normal to get the concatenation of words, and not only the words
 indexed? Did I miss something about Terms?

 Thank you very much,
 Best regards all,
 Victor



Re: changing schema

2011-02-01 Thread Stefan Matheis
From http://wiki.apache.org/solr/DataImportHandler#Commands

 The handler exposes all its API as http requests . The following are the 
 possible operations
 [..]
 clean : (default 'true'). Tells whether to clean up the index before the 
 indexing is started

so, no need for an (additional) delete *:*, right?
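Per the wiki text quoted above, the clean flag simply rides along as a request parameter on the DIH command URL, for example (host, port, and core path are placeholders):

```
http://localhost:8983/solr/dataimport?command=full-import&clean=true
```

Since clean defaults to true, passing clean=false is what prevents the index from being emptied before the import.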

On Tue, Feb 1, 2011 at 2:04 PM, Erick Erickson erickerick...@gmail.com wrote:
 That sounds right. You can cheat and just remove solr_home/data/index
 rather than delete *:* though (you should probably do that with the Solr
 instance stopped)

 Make sure to remove the directory index as well.

 Best
 Erick

 On Tue, Feb 1, 2011 at 1:27 AM, Dennis Gearon gear...@sbcglobal.net wrote:

 Anyone got a great little script for changing a schema?

 i.e., after changing:
  database,
  the view in the database for data import
  the data-config.xml file
  the schema.xml file

 I BELIEVE that I have to run:
  a delete command for the whole index *:*
  a full import and optimize

 This all sound right?

  Dennis Gearon


 Signature Warning
 
 It is always a good idea to learn from your own mistakes. It is usually a
 better
 idea to learn from others’ mistakes, so you do not have to make them
 yourself.
 from 'http://blogs.techrepublic.com.com/security/?p=4501tag=nl.e036'


 EARTH has a Right To Life,
 otherwise we all die.





escaping parentheses in search query doesn't work...

2011-02-01 Thread Pierre-Yves LANDRON

Hello! I've seen that in order to search terms with parentheses, those have to
be escaped, as in title:\(term\). But it doesn't seem to work - the
parentheses aren't taken into account. Here is the field type I'm using to
index these data:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt"
            ignoreCase="true" expand="false"/>
    -->
    <!-- Case insensitive stop word removal.
         enablePositionIncrements=true ensures that a 'gap' is left to
         allow for accurate phrase queries. -->
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="1" catenateNumbers="1"
            catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/> -->
    <filter class="solr.SnowballPorterFilterFactory" language="French"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory"
            synonyms="synonyms.txt"
            ignoreCase="true"
            expand="true"/>
    <filter class="solr.StopFilterFactory"
            words="stopwords.txt"
            ignoreCase="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="0" catenateNumbers="0"
            catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/> -->
    <filter class="solr.SnowballPorterFilterFactory" language="French"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

How can I search parentheses within my query?
Thanks,
P.
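For reference, a hand-rolled sketch of escaping Lucene/Solr query-syntax characters, similar in spirit to SolrJ's ClientUtils.escapeQueryChars (this is an illustration, not the SolrJ class itself; note that even a correctly escaped parenthesis only survives if the analysis chain keeps it, and the WordDelimiterFilterFactory above may strip it again at index time):

```java
// Escape characters that have special meaning in the Lucene query syntax
// so they are treated as literal text. Hand-rolled illustration only.
public class QueryEscapeSketch {
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            // Query-syntax metacharacters that need a leading backslash
            if ("\\+-!():^[]\"{}~*?|&;/".indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```

For example, escape("(term)") produces \(term\), the form used in the question above.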
  

Re: SOLR 1.4 and Lucene 3.0.3 index problem

2011-02-01 Thread Estrada Groups
I have the exact opposite problem where Luke won't even load the index but Solr 
starts fine. I believe there are major differences between the two indexes that 
are causing all these issues.

Adam



On Feb 1, 2011, at 6:28 AM, Churchill Nanje Mambe 
mambena...@afrovisiongroup.com wrote:

 hi guys
 I have developed a java crawler and integrated the lucene 3.0.3 API into it
 so it creates a Lucene.
 now I wish to search this lucene index using solr, I tried to configure the
 solrconfig.xml and schema.xml, everything seems to be fine
 but then solr told me the index is corrupt but I use luke and I am able to
 browse the index and perform searches and other things on it
 can someone help me which solr can wrap around a lucene 3.0.3 index ??
 regards
 
 Mambe Churchill Nanje
 237 33011349,
 AfroVisioN Founder, President,CEO
 http://www.afrovisiongroup.com | http://mambenanje.blogspot.com
 skypeID: mambenanje
 www.twitter.com/mambenanje


Re: CUSTOM JSP FOR APACHE SOLR

2011-02-01 Thread Estrada Groups
Has anyone noticed the rails application that installs with Solr4.0? I am 
interested to hear some feedback on that one...

Adam


On Jan 31, 2011, at 4:25 PM, Paul Libbrecht p...@hoplahup.net wrote:

 Tomas,
 
 I also know velocity can be used and works well.
 I would be interested in a simpler way to have the objects of Solr available
 in a JSP than writing a custom JSP processor as a request handler; indeed, this
 seems to be the way SolrJ is expected to be used per the wiki page.
 
 Actually I migrated to velocity (which I like less than jsp) just because I 
 did not find a response to this question.
 
 paul
 
 
 Le 31 janv. 2011 à 21:53, Tomás Fernández Löbbe a écrit :
 
 Hi John, you can use whatever you want for building your application, using
 Solr on the backend (JSP included). You should find all the information you
 need on Solr's wiki page:
 http://wiki.apache.org/solr/
 
 including some client libraries to easily integrate your application with Solr:
 http://wiki.apache.org/solr/IntegratingSolr
 
 For fast prototyping you could use Velocity:
 http://wiki.apache.org/solr/VelocityResponseWriter
 
 Anyway, I recommend you start with Solr's tutorial:
 http://lucene.apache.org/solr/tutorial.html
 
 Good luck,
 Tomás
 
 2011/1/31 JOHN JAIRO GÓMEZ LAVERDE jjai...@hotmail.com
 
 
 
 SOLR LUCENE
 DEVELOPERS
 
 Hi, I am new to Solr and I would like to make a custom search page for
 enterprise users in JSP that takes the results from Apache Solr.
 
 - Where can I find some useful examples for that topic?
 - Is JSP the correct approach to solve my requirement?
 - If not, what is the best solution to build a customized search page for my
 users?
 
 Thanks
 from South America
 
 JOHN JAIRO GOMEZ LAVERDE
 Bogotá - Colombia
 
 


Re: SOLR 1.4 and Lucene 3.0.3 index problem

2011-02-01 Thread Churchill Nanje Mambe
is there any way I can change the Lucene version wrapped inside Solr 1.4
from Lucene 2.x to Lucene 3.x?
 Any tutorials? I am guessing that's where the index data doesn't match.
 Something I also found out is that Solr 1.4 expects the index to be
lucene_index_folder/index, while a Lucene 3.x index is just the folder
lucene_index_folder.
 In my case it's crawl_data/ for Lucene, but Solr 1.4 expects
crawl_data/index, and when I point to this in solrconfig.xml it auto-creates
crawl_data/index.

I badly need this help

Mambe Churchill Nanje
237 33011349,
AfroVisioN Founder, President,CEO
http://www.afrovisiongroup.com | http://mambenanje.blogspot.com
skypeID: mambenanje
www.twitter.com/mambenanje



On Tue, Feb 1, 2011 at 2:52 PM, Estrada Groups 
estrada.adam.gro...@gmail.com wrote:

 I have the exact opposite problem where Luke won't even load the index but
 Solr starts fine. I believe there are major differences between the two
 indexes that are causing all these issues.

 Adam



 On Feb 1, 2011, at 6:28 AM, Churchill Nanje Mambe 
 mambena...@afrovisiongroup.com wrote:

  hi guys
  I have developed a java crawler and integrated the lucene 3.0.3 API into
 it
  so it creates a Lucene.
  now I wish to search this lucene index using solr, I tried to configure
 the
  solrconfig.xml and schema.xml, everything seems to be fine
  but then solr told me the index is corrupt but I use luke and I am able
 to
  browse the index and perform searches and other things on it
  can someone help me which solr can wrap around a lucene 3.0.3 index ??
  regards
 
  Mambe Churchill Nanje
  237 33011349,
  AfroVisioN Founder, President,CEO
  http://www.afrovisiongroup.com | http://mambenanje.blogspot.com
  skypeID: mambenanje
  www.twitter.com/mambenanje



Re: escaping parentheses in search query doesn't work...

2011-02-01 Thread shan2812

Hi,

I think you can search without the escape sequence, as it's not necessary.
Instead, just try (term) and it should work.

Regards
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Http-Connection-is-hanging-while-deleteByQuery-tp2367405p2397455.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SOLR 1.4 and Lucene 3.0.3 index problem

2011-02-01 Thread Upayavira
What problem are you trying to solve by using a Lucene 3.x index within
a Solr 1.4 system?

Upayavira

On Tue, 01 Feb 2011 14:59 +0100, Churchill Nanje Mambe
mambena...@afrovisiongroup.com wrote:
 is there any way I can change the lucene version wrapped in side solr 1.4
 from lucene 2.x to lucene 3.x.
  any tutorials as I am guessing thats where the index data doesnt match.
  something I also found out is that solr 1.4 expects the index to be
 luce_index_folder/index while lucene 3.x index is just the folder
 lucene_index_folder
  in my case its crawl_data/ for lucene but solr 1.4 is expect
 crawl_data/index and when I point to this in solrconfig.xml it auto
 creates
 crawl_data/index
 
 I badly need this help
 
 Mambe Churchill Nanje
 237 33011349,
 AfroVisioN Founder, President,CEO
 http://www.afrovisiongroup.com | http://mambenanje.blogspot.com
 skypeID: mambenanje
 www.twitter.com/mambenanje
 
 
 
 On Tue, Feb 1, 2011 at 2:52 PM, Estrada Groups 
 estrada.adam.gro...@gmail.com wrote:
 
  I have the exact opposite problem where Luke won't even load the index but
  Solr starts fine. I believe there are major differences between the two
  indexes that are causing all these issues.
 
  Adam
 
 
 
  On Feb 1, 2011, at 6:28 AM, Churchill Nanje Mambe 
  mambena...@afrovisiongroup.com wrote:
 
   hi guys
   I have developed a java crawler and integrated the lucene 3.0.3 API into
  it
   so it creates a Lucene.
   now I wish to search this lucene index using solr, I tried to configure
  the
   solrconfig.xml and schema.xml, everything seems to be fine
   but then solr told me the index is corrupt but I use luke and I am able
  to
   browse the index and perform searches and other things on it
   can someone help me which solr can wrap around a lucene 3.0.3 index ??
   regards
  
   Mambe Churchill Nanje
   237 33011349,
   AfroVisioN Founder, President,CEO
   http://www.afrovisiongroup.com | http://mambenanje.blogspot.com
   skypeID: mambenanje
   www.twitter.com/mambenanje
 
 
--- 
Enterprise Search Consultant at Sourcesense UK, 
Making Sense of Open Source



Re: SOLR 1.4 and Lucene 3.0.3 index problem

2011-02-01 Thread Churchill Nanje Mambe
I'm sorry,
 I downloaded the released Solr version, as I don't know how to build Solr
myself, but I wrote my crawler with Lucene 3.x.
 Now I need Solr to search this index, so I tried to use Solr 1.4, which I
downloaded from the site as the most recent version,
 but now I can't seem to read the index. I considered writing my own servlet
RESTful API or SOAP webservice, but I wish that Solr would work, so I don't go
through the stress of recreating what Solr already has.
 So what am I to do?
 Do you have a higher version of Solr that uses Lucene 3.x, so I can
download it?
regards

Mambe Churchill Nanje
237 33011349,
AfroVisioN Founder, President,CEO
http://www.afrovisiongroup.com | http://mambenanje.blogspot.com
skypeID: mambenanje
www.twitter.com/mambenanje



On Tue, Feb 1, 2011 at 3:53 PM, Upayavira u...@odoko.co.uk wrote:

 What problem are you trying to solve by using a Lucene 3.x index within
 a Solr 1.4 system?

 Upayavira

 On Tue, 01 Feb 2011 14:59 +0100, Churchill Nanje Mambe
 mambena...@afrovisiongroup.com wrote:
  is there any way I can change the lucene version wrapped in side solr 1.4
  from lucene 2.x to lucene 3.x.
   any tutorials as I am guessing thats where the index data doesnt match.
   something I also found out is that solr 1.4 expects the index to be
  luce_index_folder/index while lucene 3.x index is just the folder
  lucene_index_folder
   in my case its crawl_data/ for lucene but solr 1.4 is expect
  crawl_data/index and when I point to this in solrconfig.xml it auto
  creates
  crawl_data/index
 
  I badly need this help
 
  Mambe Churchill Nanje
  237 33011349,
  AfroVisioN Founder, President,CEO
  http://www.afrovisiongroup.com | http://mambenanje.blogspot.com
  skypeID: mambenanje
  www.twitter.com/mambenanje
 
 
 
  On Tue, Feb 1, 2011 at 2:52 PM, Estrada Groups 
  estrada.adam.gro...@gmail.com wrote:
 
   I have the exact opposite problem where Luke won't even load the index
 but
   Solr starts fine. I believe there are major differences between the two
   indexes that are causing all these issues.
  
   Adam
  
  
  
   On Feb 1, 2011, at 6:28 AM, Churchill Nanje Mambe 
   mambena...@afrovisiongroup.com wrote:
  
hi guys
I have developed a java crawler and integrated the lucene 3.0.3 API
 into
   it
so it creates a Lucene.
now I wish to search this lucene index using solr, I tried to
 configure
   the
solrconfig.xml and schema.xml, everything seems to be fine
but then solr told me the index is corrupt but I use luke and I am
 able
   to
browse the index and perform searches and other things on it
can someone help me which solr can wrap around a lucene 3.0.3 index
 ??
regards
   
Mambe Churchill Nanje
237 33011349,
AfroVisioN Founder, President,CEO
http://www.afrovisiongroup.com | http://mambenanje.blogspot.com
skypeID: mambenanje
www.twitter.com/mambenanje
  
 
 ---
 Enterprise Search Consultant at Sourcesense UK,
 Making Sense of Open Source




Re: Terms and termscomponent questions

2011-02-01 Thread openvictor Open
Dear Erick,

Thank you for your answer. Here is my fieldType definition; I took the
standard one, because I don't need a better one for this field:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="1" catenateNumbers="1"
            catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"
            protected="protwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="0" catenateNumbers="0"
            catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"
            protected="protwords.txt"/>
  </analyzer>
</fieldType>

Now my field :

<field name="p_field" type="text" indexed="true" stored="true"/>

But I have a doubt now... Do I really put a space between words, or is it
just a comma? If I only put a comma, is the whole process going to be
impacted? What I don't really understand is that I find the separate words,
but also their concatenation (and again, in one direction only). Let me
explain: if I have "man bear pig", I will find
"manbearpig" and "bearpig", but never "pigman" or any other combination in a
different order.
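One plausible culprit, given the index-time chain above, is WordDelimiterFilterFactory with catenateWords="1": for any token the filter splits, it also emits the concatenation of the parts as an extra term, and only in left-to-right order, which would match seeing "manbearpig" but never "pigman". A toy model of that catenation (an illustration, not the real filter, and it assumes the words arrive in one token such as "man-bear-pig" rather than whitespace-separated):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Rough simulation of WordDelimiterFilterFactory with catenateWords="1":
// a single token is split on delimiters, and the concatenation of the
// parts is emitted as an extra token, in original order only.
public class CatenateSketch {
    static List<String> analyze(String token) {
        List<String> out = new ArrayList<>(Arrays.asList(token.split("[^A-Za-z0-9]+")));
        if (out.size() > 1) {
            out.add(String.join("", out)); // e.g. "man-bear-pig" also yields "manbearpig"
        }
        return out;
    }
}
```

If the commas arrive attached to the words (e.g. "word11," as one whitespace token), the same mechanism would explain the extra concatenated terms in the TermsComponent output.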

Thank you very much
Best Regards,
Victor

2011/2/1 Erick Erickson erickerick...@gmail.com

 Nope, this isn't what I'd expect. There are a couple of possibilities:
 1> check out what WordDelimiterFilterFactory is doing, although
 if you're really sending spaces, that's probably not it.
 2> Let's see the field and fieldType definitions for the field
 in question. type="text" doesn't say anything about analysis,
 and that's where I'd expect you're having trouble - in particular,
 if your analysis chain uses KeywordTokenizerFactory, for instance.
 3> Look at the admin/schema browser page, look at your field and
 see what the actual tokens are. That'll tell you what TermsComponent
 is returning; perhaps the concatenation is happening somewhere
 else.

 Bottom line: Solr will not concatenate terms like this unless you tell it
 to, so I suspect you're telling it to, you just don't realize it <g>...

 Best
 Erick

 On Tue, Feb 1, 2011 at 1:33 AM, openvictor Open openvic...@gmail.com
 wrote:

  Dear Solr users,
 
  I am currently using SolR and TermsComponents to make an auto suggest for
  my
  website.
 
  I have a field called p_field indexed and stored with type=text in the
  schema xml. Nothing out of the usual.
  I feed to Solr a set of words separated by a coma and a space such as
 (for
  two documents) :
 
  Document 1:
  word11, word12, word13. word14
 
  Document 2:
  word21, word22, word23. word24
 
 
  When I use my newly designed field I get things for the prefix word1 :
  word11, word12, word13. word14 word11word12 word11word13 etc...
  Is it normal to have the concatenation of words and not only the words
  indexed ? Did I miss something about Terms ?
 
  Thank you very much,
  Best regards all,
  Victor
 



Re: Solr for noSQL

2011-02-01 Thread openvictor Open
Hi all, I don't know if it answers any of your questions, but if you are
interested in that, check out:

Lucandra (Cassandra + Lucene)



2011/2/1 Steven Noels stev...@outerthought.org

 On Tue, Feb 1, 2011 at 11:52 AM, Upayavira u...@odoko.co.uk wrote:


 
  Apologies if my nothing funky sounded like you weren't doing cool
  stuff.


 No offense whatsoever. I think my longer reply paints a more accurate light
 on what Lily means in terms of SOLR for NoSQL, and it was your reaction
 who triggered this additional explanation.


  I was merely attempting to say that I very much doubt you were
  doing anything funky like putting HBase underneath Solr as a replacement
  of FSDirectory.


 There are some initiatives in the context of Cassandra IIRC, as well as a
 project which stores Lucene index files in HBase tables, but frankly they
 seem more experimentation, and also I think the nature of how Lucene/SOLR
 works + what HBase does on top of Hadoop FS somehow is in conflict with
 each
 other. Too many layers of indirection will kill performance on every layer.



  I was trying to imply that, likely your integration with
  Solr was relatively conventional (interacting with its REST interface),
 


 Yep. We figured that was the wiser road to walk, and leaves a clear-defined
 interface and possible area of improvement against a too-low level of
 integration.


  and the funky stuff that you are doing sits outside of that space.
 
  Hope that's a clearer (and more accurate?) attempt at what I was trying
  to say.
 
  Upayavira (who finds the Lily project interesting, and would love to
  find the time to play with it)
 

 Anytime, Upayavira. Anytime! ;-)

 Steven.
 --
 Steven Noels
 http://outerthought.org/
 Scalable Smart Data
 Makers of Kauri, Daisy CMS and Lily



Next steps in loading plug-in

2011-02-01 Thread McGibbney, Lewis John
Hi list,

Having had a thorough look at the wiki over the weekend and done some testing 
myself, I have some additional questions about loading my plug-in into Solr. 
Taking the 'Old Way' of loading plug-ins, I have JARred up the relevant classes 
and added the JAR to the web app's WEB-INF/lib dir. I am unsure of the next steps 
to take, as my plug-in has extension properties (which specify web-based OWL files 
which I wish to use whenever the plug-in is invoked). My main question is 
where I would include these config properties. My initial thought is that 
they would go in WEB-INF/web.xml, but I am unsure how to 
include them. I have had a good look at web.xml and think that they could be 
included as <init-param>s, but this is solely due to my lack of knowledge of 
this situation.
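If the <init-param> route is taken, the entry in WEB-INF/web.xml has the standard servlet shape shown below (the servlet class and parameter names are invented for illustration; whether the plug-in code actually reads servlet init parameters is a separate question):

```xml
<servlet>
  <servlet-name>owl-plugin</servlet-name>
  <servlet-class>com.example.OwlPluginServlet</servlet-class>
  <init-param>
    <param-name>owlFileUrl</param-name>
    <param-value>http://example.org/ontologies/plugin.owl</param-value>
  </init-param>
</servlet>
```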

Thank you

Lewis


Glasgow Caledonian University is a registered Scottish charity, number SC021474

Winner: Times Higher Education's Widening Participation Initiative of the Year 
2009 and Herald Society's Education Initiative of the Year 2009
http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,6219,en.html


Malformed XML with exotic characters

2011-02-01 Thread Markus Jelsma
There is an issue with the XML response writer. It cannot cope with some very 
exotic characters, or possibly with right-to-left writing systems. The issue can 
be reproduced by indexing the content of the home page of Wikipedia, as it 
contains a lot of exotic matter. The problem does not affect the JSON response 
writer.

The problem is, I am unsure whether this is a bug in Solr, or whether perhaps 
Firefox itself trips over it.


Here's the output of the JSONResponseWriter for a query returning the home 
page:
{
 "responseHeader":{
  "status":0,
  "QTime":1,
  "params":{
    "fl":"url,content",
    "indent":"true",
    "wt":"json",
    "q":"*:*",
    "rows":"1"}},
 "response":{"numFound":6744,"start":0,"docs":[
    {
     "url":"http://www.wikipedia.org/",
     "content":"Wikipedia English The Free Encyclopedia 3 543 000+ articles 
日
本語 フリー百科事典 730 000+ 記事 Deutsch Die freie Enzyklopädie 1 181 000+ Artikel 
Español La enciclopedia libre 710 000+ artículos Français L’encyclopédie libre 
1 061 000+ articles Русский Свободная энциклопедия 654 000+ статей Italiano 
L’enciclopedia libera 768 000+ voci Português A enciclopédia livre 669 000+ 
artigos Polski Wolna encyklopedia 769 000+ haseł Nederlands De vrije 
encyclopedie 668 000+ artikelen Search  • Suchen  • Rechercher  • Szukaj  • 
Ricerca  • 検索  • Buscar  • Busca  • Zoeken  • Поиск  • Sök  • 搜尋  • Cerca  • 
Søk  • Haku  • Пошук  • Hledání  • Keresés  • Căutare  • 찾기  • Tìm kiếm  • Ara  
• Cari  • Søg  • بحث  • Serĉu  • Претрага  • Paieška  • Hľadať  • Suk  • جستجو  
• חיפוש  • Търсене  • Poišči  • Cari  • Bilnga العربية Български Català Česky 
Dansk Deutsch English Español Esperanto فارسی Français 한국어 Bahasa Indonesia 
Italiano עברית Lietuvių Magyar Bahasa Melayu Nederlands 日本語 Norsk (bokmål) 
Polski Português Română Русский Slovenčina Slovenščina Српски / Srpski Suomi 
Svenska Türkçe Українська Tiếng Việt Volapük Winaray 中文   100 000+   العربية  
• Български  • Català  • Česky  • Dansk  • Deutsch  • English  • Español  • 
Esperanto  • فارسی  • Français  • 한국어  • Bahasa Indonesia  • Italiano  • עברית  
• Lietuvių  • Magyar  • Bahasa Melayu  • Nederlands  • 日本語  • Norsk (bokmål)  
• Polski  • Português  • Русский  • Română  • Slovenčina  • Slovenščina  • 
Српски / Srpski  • Suomi  • Svenska  • Türkçe  • Українська  • Tiếng Việt  • 
Volapük  • Winaray  • 中文   10 000+   Afrikaans  • Aragonés  • Armãneashce  • 
Asturianu  • Kreyòl Ayisyen  • Azərbaycan / آذربايجان ديلی  • বাংলা  • 
Беларуская 
( Акадэмічная  • Тарашкевiца )  • বিষ্ণুপ্রিযা় মণিপুরী  • Bosanski  • 
Brezhoneg  • Чăваш  
• Cymraeg  • Eesti  • Ελληνικά  • Euskara  • Frysk  • Gaeilge  • Galego  • 
ગુજરાતી  • Հայերեն  • हिन्दी  • Hrvatski  • Ido  • Íslenska  • Basa Jawa  • 
ಕನ್ನಡ  • 
ქართული  • Kurdî / كوردی  • Latina  • Latviešu  • Lëtzebuergesch  • Lumbaart  
• Македонски  • മലയാളം  • मराठी  • नेपाल भाषा  • नेपाली  • Norsk (nynorsk)  • 
Nnapulitano  
• Occitan  • Piemontèis  • Plattdüütsch  • Ripoarisch  • Runa Simi  • شاہ مکھی 
پنجابی  • Shqip  • Sicilianu  • Simple English  • Sinugboanon  • 
Srpskohrvatski / Српскохрватски  • Basa Sunda  • Kiswahili  • Tagalog  • தமிழ்  
• తెలుగు  • ไทย  • اردو  • Walon  • Yorùbá  • 粵語  • Žemaitėška   1 000+   Bahsa 
Acèh  • Alemannisch  • አማርኛ  • Arpitan  • ܐܬܘܪܝܐ  • Avañe’ẽ  • Aymar Aru  • 
Bân-lâm-gú  • Bahasa Banjar  • Basa Banyumasan  • Башҡорт  • भोजपुरी  • Bikol 
Central  • Boarisch  • བོད་ཡིག  • Chavacano de Zamboanga  • Corsu  • Deitsch  • 
ދިވެހި  • Diné Bizaad  • Eald Englisc  • Emigliàn–Rumagnòl  • Эрзянь  • 
Estremeñu  
• Fiji Hindi  • Føroyskt  • Furlan  • Gaelg  • Gàidhlig  • 贛語  • گیلکی  • Hak-
kâ-fa / 客家話  • Хальмг  • ʻŌlelo Hawaiʻi  • Hornjoserbsce  • Ilokano  • 
Interlingua  • Interlingue  • Ирон Æвзаг  • Kapampangan  • Kaszëbsczi  • 
Kernewek  • ភាសាខ្មែរ  • Kinyarwanda  • Коми  • Кыргызча  • Ladino / לאדינו  • 
Ligure  • Limburgs  • Lingála  • lojban  • Malagasy  • Malti  • 文言  • Māori  • 
مصرى  • مازِرونی / Mäzeruni  • Монгол  • မြန်မာဘာသာ  • Nāhuatlahtōlli  • 
Nedersaksisch  • Nouormand  • Novial  • Нохчийн  • Олык Марий  • O‘zbek  • पाऴि 
 
• Pangasinán  • ਪੰਜਾਬੀ / پنجابی  • Papiamentu  • پښتو  • Picard  • Къарачай–
Малкъар  • Қазақша  • Qırımtatarca  • Rumantsch  • Русиньскый Язык  • संस्कृतम् 
 • 
Sámegiella  • Sardu  • Саха Тыла  • Scots  • Seeltersk  • සිංහල  • Ślůnski  • 
Af 
Soomaali  • کوردی  • Tarandíne  • Татарча / Tatarça  • Тоҷикӣ  • Lea faka-
Tonga  • Türkmen  • Удмурт  • ᨅᨔ ᨕᨙᨁᨗ  • Uyghur / ئۇيغۇرچه  • Vèneto  • Võro  • 
West-Vlams  • Wolof  • 吴语  • ייִדיש  • Zazaki   100+   Akan  • Аҧсуа  • Авар  • 
Bamanankan  • Bislama  • Буряад  • Chamoru  • Chichewa  • Cuengh  • 
Dolnoserbski  • Eʋegbe  • Frasch  • Fulfulde  • Gagauz  • Gĩkũyũ  • 
  • Hausa / هَوُسَا  • Igbo  • ᐃᓄᒃᑎᑐᑦ / Inuktitut  • Iñupiak  • 
Kalaallisut  • कश्मीरी / كشميري  • Kongo  • Кырык Мары  • ພາສາລາວ  • Лакку  • 
Luganda  • Mìng-dĕ̤ng-ngṳ̄  • Mirandés  • Мокшень  • Молдовеняскэ  • Na Vosa 
Vaka-Viti  • Dorerin Naoero  • Nēhiyawēwin / ᓀᐦᐃᔭᐍᐏᐣ  • Norfuk / Pitkern  • 

Re: Malformed XML with exotic characters

2011-02-01 Thread Stefan Matheis
Hi Markus,

to verify that it's not an Firefox-Issue, try xmllint on your shell to
check the given xml?

Regards
Stefan
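[Editorial aside: Stefan's xmllint check can also be sketched in Python; expat, like xmllint, rejects UTF-8 byte sequences that encode bare UTF-16 surrogates, which is what a writer that emits surrogates as individual characters produces. This snippet is illustrative, not part of the original thread.]

```python
import xml.parsers.expat

# Encode a surrogate pair the wrong way: as two separate 3-byte UTF-8
# sequences instead of one 4-byte sequence for the real code point.
bad = b'<?xml version="1.0"?><r>' \
    + '\ud800\udf32'.encode('utf-8', 'surrogatepass') + b'</r>'

parser = xml.parsers.expat.ParserCreate()
try:
    parser.Parse(bad, True)
    print("well-formed")
except xml.parsers.expat.ExpatError as err:
    print("malformed:", err)  # expat rejects it, just as xmllint does
```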

On Tue, Feb 1, 2011 at 4:43 PM, Markus Jelsma
markus.jel...@openindex.io wrote:
 There is an issue with the XML response writer. It cannot cope with some very
 exotic characters or possibly the right-to-left writing systems. The issue can
 be reproduced by indexing the content of the home page of wikipedia as it
 contains a lot of exotic matter. The problem does not affect the JSON response
 writer.

 The problem is, i am unsure whether this is a bug in Solr or that perhaps
 Firefox itself trips over.



Re: SOLR 1.4 and Lucene 3.0.3 index problem

2011-02-01 Thread Koji Sekiguchi

(11/02/01 23:58), Churchill Nanje Mambe wrote:

I am sorry,
  I downloaded the released Solr version as I don't know how to build Solr
myself,
  but I wrote my crawler with Lucene 3.x.
  Now I need Solr to search this index, so I tried the Solr 1.4 I
downloaded from the site as the most recent version,
  but now I can't seem to read the index. I considered writing my own RESTful
servlet API or SOAP web service, but I wish Solr to work so I don't go
through the stress of recreating what Solr already has.
  So what am I to do?
  Do you have a higher version of Solr that uses Lucene 3.x, so I can
download it?


If I remember correctly, Lucene 2.9.4 can read Lucene 3.0 index.
So if your index is written by Lucene 3.0 program, you can use
Solr 1.4.1 with Lucene 2.9.4 libraries.

Or simply use branch_3x, it can be downloaded by using subversion:

$ svn co http://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x

Koji
--
http://www.rondhuit.com/en/


Re: SOLR 1.4 and Lucene 3.0.3 index problem

2011-02-01 Thread Peter Karich
 Solr 1.4.x uses Lucene 2.9.x

you could try the trunk, which uses Lucene 3.0.3 and should be compatible,
if I'm correct

Regards,
Peter.
 I have the exact opposite problem where Luke won't even load the index but 
 Solr starts fine. I believe there are major differences between the two 
 indexes that are causing all these issues.

 Adam



 On Feb 1, 2011, at 6:28 AM, Churchill Nanje Mambe 
 mambena...@afrovisiongroup.com wrote:

 hi guys,
 I have developed a Java crawler and integrated the Lucene 3.0.3 API into it,
 so it creates a Lucene index.
 Now I wish to search this Lucene index using Solr. I tried to configure the
 solrconfig.xml and schema.xml, and everything seems to be fine,
 but then Solr told me the index is corrupt. Yet with Luke I am able to
 browse the index and perform searches and other things on it.
 Can someone tell me which Solr can wrap around a Lucene 3.0.3 index?
 regards

 Mambe Churchill Nanje
 237 33011349,
 AfroVisioN Founder, President,CEO
 http://www.afrovisiongroup.com | http://mambenanje.blogspot.com
 skypeID: mambenanje
 www.twitter.com/mambenanje


-- 
http://jetwick.com open twitter search



Re: Malformed XML with exotic characters

2011-02-01 Thread François Schiettecatte
Markus 

A few things to check. Make sure whatever server Solr is hosted on is outputting 
UTF-8 (URIEncoding="UTF-8" in the Connector section of server.xml on Tomcat, for 
example), which it looks like it is here. Also make sure the HTTP header tells 
Firefox that it is getting UTF-8 (otherwise it defaults to ISO-8859-1/Latin-1). 
Finally, make sure that whatever font you use in Firefox has the 'exotic' 
characters you are expecting. There might also be some issues on your platform 
with mixing script direction, but that is probably not likely.
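[Editorial aside: the ISO-8859-1 fallback mentioned here is easy to demonstrate. Interpreting UTF-8 bytes as Latin-1 does not cause a parse error; it silently turns each multi-byte character into mojibake. A small illustration, not from the original mail:]

```python
# UTF-8 encodes 'ä' as two bytes (0xC3 0xA4); a browser that falls back
# to ISO-8859-1 decodes those bytes as two separate characters.
utf8_bytes = "Enzyklopädie".encode("utf-8")
print(utf8_bytes.decode("iso-8859-1"))  # EnzyklopÃ¤die
```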

Cheers

François

On Feb 1, 2011, at 10:43 AM, Markus Jelsma wrote:

 There is an issue with the XML response writer. It cannot cope with some very 
 exotic characters or possibly the right-to-left writing systems. The issue 
 can 
 be reproduced by indexing the content of the home page of wikipedia as it 
 contains a lot of exotic matter. The problem does not affect the JSON 
 response 
 writer.
 
 The problem is, i am unsure whether this is a bug in Solr or that perhaps 
 Firefox itself trips over.
 
 

Re: Malformed XML with exotic characters

2011-02-01 Thread Markus Jelsma
It's throwing out a lot of disturbing messages:

select.xml:17: parser error : Char 0xD800 out of allowed range
ki  • Eʋegbe  • Frasch  • Fulfulde  • Gagauz  • Gĩkũyũ  • 
   ^
select.xml:17: parser error : PCDATA invalid Char value 55296
ki  • Eʋegbe  • Frasch  • Fulfulde  • Gagauz  • Gĩkũyũ  • 
   ^
select.xml:17: parser error : Char 0xDF32 out of allowed range
 • Eʋegbe  • Frasch  • Fulfulde  • Gagauz  • Gĩkũyũ  • �
   ^
select.xml:17: parser error : PCDATA invalid Char value 57138
 • Eʋegbe  • Frasch  • Fulfulde  • Gagauz  • Gĩkũyũ  • �
   ^
select.xml:17: parser error : Char 0xD800 out of allowed range
�� Eʋegbe  • Frasch  • Fulfulde  • Gagauz  • Gĩkũyũ  • ��
   ^
select.xml:17: parser error : PCDATA invalid Char value 55296
�� Eʋegbe  • Frasch  • Fulfulde  • Gagauz  • Gĩkũyũ  • ��
   ^
select.xml:17: parser error : Char 0xDF3F out of allowed range
Eʋegbe  • Frasch  • Fulfulde  • Gagauz  • Gĩkũyũ  • ���
   ^
select.xml:17: parser error : PCDATA invalid Char value 57151
Eʋegbe  • Frasch  • Fulfulde  • Gagauz  • Gĩkũyũ  • ���
   ^
select.xml:17: parser error : Char 0xD800 out of allowed range
egbe  • Frasch  • Fulfulde  • Gagauz  • Gĩkũyũ  • 
   ^
select.xml:17: parser error : PCDATA invalid Char value 55296
egbe  • Frasch  • Fulfulde  • Gagauz  • Gĩkũyũ  • 
   ^
select.xml:17: parser error : Char 0xDF44 out of allowed range
e  • Frasch  • Fulfulde  • Gagauz  • Gĩkũyũ  • �
   ^
select.xml:17: parser error : PCDATA invalid Char value 57156
e  • Frasch  • Fulfulde  • Gagauz  • Gĩkũyũ  • �
   ^
select.xml:17: parser error : Char 0xD800 out of allowed range
�• Frasch  • Fulfulde  • Gagauz  • Gĩkũyũ  • ��
   ^
select.xml:17: parser error : PCDATA invalid Char value 55296
�• Frasch  • Fulfulde  • Gagauz  • Gĩkũyũ  • ��
   ^
select.xml:17: parser error : Char 0xDF39 out of allowed range
� Frasch  • Fulfulde  • Gagauz  • Gĩkũyũ  • ���
   ^
select.xml:17: parser error : PCDATA invalid Char value 57145
� Frasch  • Fulfulde  • Gagauz  • Gĩkũyũ  • ���
   ^
select.xml:17: parser error : Char 0xD800 out of allowed range
rasch  • Fulfulde  • Gagauz  • Gĩkũyũ  • 
   ^
select.xml:17: parser error : PCDATA invalid Char value 55296
rasch  • Fulfulde  • Gagauz  • Gĩkũyũ  • 
   ^
select.xml:17: parser error : Char 0xDF43 out of allowed range
ch  • Fulfulde  • Gagauz  • Gĩkũyũ  • �
   ^
select.xml:17: parser error : PCDATA invalid Char value 57155
ch  • Fulfulde  • Gagauz  • Gĩkũyũ  • �
   ^
select.xml:17: parser error : Char 0xD800 out of allowed range
 • Fulfulde  • Gagauz  • Gĩkũyũ  • ��
   ^
select.xml:17: parser error : PCDATA invalid Char value 55296
 • Fulfulde  • Gagauz  • Gĩkũyũ  • ��
   ^
select.xml:17: parser error : Char 0xDF3A out of allowed range
�� Fulfulde  • Gagauz  • Gĩkũyũ  • ���
   ^
select.xml:17: parser error : PCDATA invalid Char value 57146
�� Fulfulde  • Gagauz  • Gĩkũyũ  • ���
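[Editorial aside: these errors pinpoint the root cause. Code units 0xD800–0xDFFF are UTF-16 surrogates, which XML 1.0 forbids as characters; each reported pair such as 0xD800/0xDF32 actually stands for a single supplementary-plane character that the writer split in two. The standard UTF-16 decoding arithmetic recovers the real code point:]

```python
# Recombine the surrogate pair reported by xmllint into the real
# supplementary-plane code point (standard UTF-16 decoding).
hi, lo = 0xD800, 0xDF32
cp = 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00)
print(hex(cp), chr(cp))  # 0x10332, a Gothic letter from the Wikipedia page
```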


On Tuesday 01 February 2011 17:00:19 Stefan Matheis wrote:
 Hi Markus,
 
 to verify that it's not an Firefox-Issue, try xmllint on your shell to
 check the given xml?
 
 Regards
 Stefan
 
 On Tue, Feb 1, 2011 at 4:43 PM, Markus Jelsma
 
 markus.jel...@openindex.io wrote:
  There is an issue with the XML response 

Re: SOLR 1.4 and Lucene 3.0.3 index problem

2011-02-01 Thread Churchill Nanje Mambe
So I should use 1.4.1, and that is already built.
What if I use Solr 4? From the source code, do you know of any tutorial I
can use to learn how to build it using the NetBeans IDE?
 I already have Ant installed.
 Or do you advise I go with 1.4.1?

Mambe Churchill Nanje
237 33011349,
AfroVisioN Founder, President,CEO
http://www.afrovisiongroup.com | http://mambenanje.blogspot.com
skypeID: mambenanje
www.twitter.com/mambenanje



On Tue, Feb 1, 2011 at 5:18 PM, Koji Sekiguchi k...@r.email.ne.jp wrote:



 If I remember correctly, Lucene 2.9.4 can read Lucene 3.0 index.
 So if your index is written by Lucene 3.0 program, you can use
 Solr 1.4.1 with Lucene 2.9.4 libraries.

 Or simply use branch_3x, it can be downloaded by using subversion:

 $ svn co http://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x

 Koji
 --
 http://www.rondhuit.com/en/



Re: Malformed XML with exotic characters

2011-02-01 Thread Markus Jelsma
Hi,

There are no typical encoding issues on my system. I can index, query and 
display English, German, Chinese, Vietnamese, etc.

Cheers

On Tuesday 01 February 2011 17:23:49 François Schiettecatte wrote:
 Markus
 
 A few things to check, make sure whatever SOLR is hosted on is outputting
 utf-8 ( URIEncoding=UTF-8 in the Connector section in server.xml on
 Tomcat for example), which it looks like here, also make sure that
 whatever http header there is tells firefox that it is getting utf-8
 (otherwise it defaults to iso-8859-1/latin-1), finally make sure that
 whatever font you use in firefox has the 'exotic' characters you are
 expecting. There might also be some issues on your platform with mixing
 script direction but that is probably not likely.
 
 Cheers
 
 François
 
 On Feb 1, 2011, at 10:43 AM, Markus Jelsma wrote:
  There is an issue with the XML response writer. It cannot cope with some
  very exotic characters or possibly the right-to-left writing systems.
  The issue can be reproduced by indexing the content of the home page of
  wikipedia as it contains a lot of exotic matter. The problem does not
  affect the JSON response writer.
  
  The problem is, i am unsure whether this is a bug in Solr or that perhaps
  Firefox itself trips over.
  
  

Re: chaning schema

2011-02-01 Thread Dennis Gearon
I tried removing the index directory once, and Tomcat refused to start up 
because it didn't have a segments file.

 


- Original Message 
From: Erick Erickson erickerick...@gmail.com
To: solr-user@lucene.apache.org
Sent: Tue, February 1, 2011 5:04:51 AM
Subject: Re: chaning schema

That sounds right. You can cheat and just remove solr_home/data/index
rather than delete *:* though (you should probably do that with the Solr
instance stopped)

Make sure to remove the directory index as well.

Best
Erick

On Tue, Feb 1, 2011 at 1:27 AM, Dennis Gearon gear...@sbcglobal.net wrote:

 Anyone got a great little script for changing a schema?

 i.e., after changing:
  database,
  the view in the database for data import
  the data-config.xml file
  the schema.xml file

 I BELIEVE that I have to run:
  a delete command for the whole index *:*
  a full import and optimize

 This all sound right?

  Dennis Gearon


 Signature Warning
 
 It is always a good idea to learn from your own mistakes. It is usually a
 better
 idea to learn from others’ mistakes, so you do not have to make them
 yourself.
 from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'


 EARTH has a Right To Life,
 otherwise we all die.





Lock obtain timed out: NativeFSLock

2011-02-01 Thread Alex Thurlow
I recently added a second core to my solr setup, and I'm now running 
into this Lock obtain timed out error when I try to update one core 
after I've updated another core.


In my update process, I add/update 1000 documents at a time and commit 
in between.  Then at the end, I commit and optimize.  The update of the 
new core has about 150k documents.  If I try to update the old core any 
time after updating the new core (even a couple hours later), I get the 
below error.  I've tried switching to the simple lock, but that didn't 
change anything.  I've tried this on solr 1.4 and 1.4.1 both with the 
spatial-solr-2.0-RC2 plugin loaded.


If I restart solr, I can then update the old core again.

Does anyone have any insight for me here?



Feb 1, 2011 10:59:57 AM org.apache.solr.common.SolrException log
SEVERE: org.apache.lucene.store.LockObtainFailedException: Lock obtain 
timed out: 
NativeFSLock@./solr/data/index/lucene-088f283afa122cf05ce7eadb1b5ce07b-write.lock

at org.apache.lucene.store.Lock.obtain(Lock.java:85)
at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1545)
at 
org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1402)
at 
org.apache.solr.update.SolrIndexWriter.init(SolrIndexWriter.java:190)
at 
org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:98)
at 
org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:173)
at 
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:220)
at 
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61)
at 
org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:139)

at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)

at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
at 
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
at 
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at 
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at 
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
at 
org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
at 
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
at 
org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at 
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)

at org.mortbay.jetty.Server.handle(Server.java:285)
at 
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
at 
org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)

at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
at 
org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
at 
org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
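[Editorial aside: the failure mode above can be mimicked with an OS-level advisory lock, which is roughly the mechanism NativeFSLock builds on — a second writer that cannot acquire the lock fails fast instead of corrupting the index. This is a loose analogy, not Lucene's actual code.]

```python
import fcntl
import os
import tempfile

lock_path = os.path.join(tempfile.mkdtemp(), "write.lock")

# First writer acquires the exclusive lock.
writer1 = open(lock_path, "w")
fcntl.flock(writer1, fcntl.LOCK_EX | fcntl.LOCK_NB)

# A second writer against the same lock file is refused immediately.
writer2 = open(lock_path, "w")
try:
    fcntl.flock(writer2, fcntl.LOCK_EX | fcntl.LOCK_NB)
    print("acquired")   # would mean the first lock was released
except OSError:
    print("lock held")  # analogue of LockObtainFailedException
```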





Re: chaning schema

2011-02-01 Thread Erik Hatcher
the trick is, you have to remove the data/ directory, not just the data/index 
subdirectory.  and of course then restart Solr.

or delete *:*?commit=true, depending on what's the best fit for your ops.

Erik

On Feb 1, 2011, at 11:41 , Dennis Gearon wrote:

 I tried removing the index directory once, and Tomcat refused to start up 
 because 
 it didn't have a segments file.
 
 
 
 



Re: chaning schema

2011-02-01 Thread Dennis Gearon
Cool, thanks for the tip, Erik :-)

There's so much to learn, and I haven't even got to tuning the thing for best 
results.

 Dennis Gearon


Signature Warning

It is always a good idea to learn from your own mistakes. It is usually a 
better 
idea to learn from others’ mistakes, so you do not have to make them yourself. 
from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'


EARTH has a Right To Life,
otherwise we all die.



- Original Message 
From: Erik Hatcher erik.hatc...@gmail.com
To: solr-user@lucene.apache.org
Sent: Tue, February 1, 2011 9:24:24 AM
Subject: Re: chaning schema

The trick is, you have to remove the data/ directory, not just the data/index 
subdirectory. And of course then restart Solr.

Or send a delete query for *:* with commit=true, depending on what's the best fit for your ops.

Erik
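
For the curious, the delete-everything-then-reindex route mentioned above can be driven over plain HTTP. A minimal Python sketch (the host, port, and core path are assumptions; adjust to your setup):

```python
import urllib.request

# Build a delete-by-query request that wipes the whole index and commits.
# The URL below is the stock single-core example setup, not a given.
def build_delete_all(base="http://localhost:8983/solr"):
    body = b"<delete><query>*:*</query></delete>"
    return urllib.request.Request(
        base + "/update?commit=true",
        data=body,
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )

# urllib.request.urlopen(build_delete_all())  # needs a running Solr instance
```

After this, a full import and optimize rebuilds the index under the new schema.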



Re: Malformed XML with exotic characters

2011-02-01 Thread Sascha Szott

Hi folks,

I've made the same observation when working with Solr's 
ExtractingRequestHandler on the command line (no browser interaction).


When issuing the following curl command

curl 'http://mysolrhost/solr/update/extract?extractOnly=true&extractFormat=text&wt=xml&resource.name=foo.pdf' \
  --data-binary @foo.pdf -H 'Content-type: text/xml; charset=utf-8' > foo.xml


Solr's XML response writer returns malformed XML; e.g., xmllint gives me:

foo.xml:21: parser error : Char 0xD835 out of allowed range
foo.xml:21: parser error : PCDATA invalid Char value 55349

I'm not totally sure if this is a Tika/PDFBox issue. However, I would 
expect in any case that the XML output produced by Solr is well-formed 
even if the libraries used under the hood return garbage.



-Sascha

P.S. I can provide the PDF file in question if anybody would like to 
see it in action.
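
As a quick way to confirm that such output really does fall outside XML 1.0's allowed character range, one can scan the returned text for offending code points. A small illustrative check (my own, not part of the original report):

```python
import re

# Characters permitted by XML 1.0; anything outside these ranges
# (including lone surrogates such as U+D835 from the xmllint error
# above) is flagged with its position and code point.
XML10_INVALID = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

def invalid_xml_chars(text):
    return [(m.start(), hex(ord(m.group()))) for m in XML10_INVALID.finditer(text)]

print(invalid_xml_chars("fine text"))        # []
print(invalid_xml_chars("bad \ud835 char"))  # [(4, '0xd835')]
```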



On 01.02.2011 16:43, Markus Jelsma wrote:

There is an issue with the XML response writer. It cannot cope with some very
exotic characters or possibly the right-to-left writing systems. The issue can
be reproduced by indexing the content of the home page of wikipedia as it
contains a lot of exotic matter. The problem does not affect the JSON response
writer.

The problem is, I am unsure whether this is a bug in Solr or whether perhaps
Firefox itself trips over it.


Here's the output of the JSONResponeWriter for a query returning the home
page:
{
  "responseHeader":{
    "status":0,
    "QTime":1,
    "params":{
      "fl":"url,content",
      "indent":"true",
      "wt":"json",
      "q":"*:*",
      "rows":"1"}},
  "response":{"numFound":6744,"start":0,"docs":[
      {
        "url":"http://www.wikipedia.org/",
        "content":"Wikipedia English The Free Encyclopedia 3 543 000+ articles 
日
本語 フリー百科事典 730 000+ 記事 Deutsch Die freie Enzyklopädie 1 181 000+ Artikel
Español La enciclopedia libre 710 000+ artículos Français L’encyclopédie libre
1 061 000+ articles Русский Свободная энциклопедия 654 000+ статей Italiano
L’enciclopedia libera 768 000+ voci Português A enciclopédia livre 669 000+
artigos Polski Wolna encyklopedia 769 000+ haseł Nederlands De vrije
encyclopedie 668 000+ artikelen Search  • Suchen  • Rechercher  • Szukaj  •
Ricerca  • 検索  • Buscar  • Busca  • Zoeken  • Поиск  • Sök  • 搜尋  • Cerca  •
Søk  • Haku  • Пошук  • Hledání  • Keresés  • Căutare  • 찾기  • Tìm kiếm  • Ara
• Cari  • Søg  • بحث  • Serĉu  • Претрага  • Paieška  • Hľadať  • Suk  • جستجو
• חיפוש  • Търсене  • Poišči  • Cari  • Bilnga العربية Български Català Česky
Dansk Deutsch English Español Esperanto فارسی Français 한국어 Bahasa Indonesia
Italiano עברית Lietuvių Magyar Bahasa Melayu Nederlands 日本語 Norsk (bokmål)
Polski Português Română Русский Slovenčina Slovenščina Српски / Srpski Suomi
Svenska Türkçe Українська Tiếng Việt Volapük Winaray 中文   100 000+   العربية
• Български  • Català  • Česky  • Dansk  • Deutsch  • English  • Español  •
Esperanto  • فارسی  • Français  • 한국어  • Bahasa Indonesia  • Italiano  • עברית
• Lietuvių  • Magyar  • Bahasa Melayu  • Nederlands  • 日本語  • Norsk (bokmål)
• Polski  • Português  • Русский  • Română  • Slovenčina  • Slovenščina  •
Српски / Srpski  • Suomi  • Svenska  • Türkçe  • Українська  • Tiếng Việt  •
Volapük  • Winaray  • 中文   10 000+   Afrikaans  • Aragonés  • Armãneashce  •
Asturianu  • Kreyòl Ayisyen  • Azərbaycan / آذربايجان ديلی  • বাংলা  • 
Беларуская
( Акадэмічная  • Тарашкевiца )  • বিষ্ণুপ্রিযা় মণিপুরী  • Bosanski  • 
Brezhoneg  • Чăваш
• Cymraeg  • Eesti  • Ελληνικά  • Euskara  • Frysk  • Gaeilge  • Galego  •
ગુજરાતી  • Հայերեն  • हिन्दी  • Hrvatski  • Ido  • Íslenska  • Basa Jawa  • 
ಕನ್ನಡ  •
ქართული  • Kurdî / كوردی  • Latina  • Latviešu  • Lëtzebuergesch  • Lumbaart
• Македонски  • മലയാളം  • मराठी  • नेपाल भाषा  • नेपाली  • Norsk (nynorsk)  • 
Nnapulitano
• Occitan  • Piemontèis  • Plattdüütsch  • Ripoarisch  • Runa Simi  • شاہ مکھی
پنجابی  • Shqip  • Sicilianu  • Simple English  • Sinugboanon  •
Srpskohrvatski / Српскохрватски  • Basa Sunda  • Kiswahili  • Tagalog  • தமிழ்
• తెలుగు  • ไทย  • اردو  • Walon  • Yorùbá  • 粵語  • Žemaitėška   1 000+   Bahsa
Acèh  • Alemannisch  • አማርኛ  • Arpitan  • ܐܬܘܪܝܐ  • Avañe’ẽ  • Aymar Aru  •
Bân-lâm-gú  • Bahasa Banjar  • Basa Banyumasan  • Башҡорт  • भोजपुरी  • Bikol
Central  • Boarisch  • བོད་ཡིག  • Chavacano de Zamboanga  • Corsu  • Deitsch  •
ދިވެހި  • Diné Bizaad  • Eald Englisc  • Emigliàn–Rumagnòl  • Эрзянь  • 
Estremeñu
• Fiji Hindi  • Føroyskt  • Furlan  • Gaelg  • Gàidhlig  • 贛語  • گیلکی  • Hak-
kâ-fa / 客家話  • Хальмг  • ʻŌlelo Hawaiʻi  • Hornjoserbsce  • Ilokano  •
Interlingua  • Interlingue  • Ирон Æвзаг  • Kapampangan  • Kaszëbsczi  •
Kernewek  • ភាសាខ្មែរ  • Kinyarwanda  • Коми  • Кыргызча  • Ladino / לאדינו  •
Ligure  • Limburgs  • Lingála  • lojban  • Malagasy  • Malti  • 文言  • Māori  •
مصرى  • مازِرونی / Mäzeruni  • Монгол  • မြန်မာဘာသာ  • Nāhuatlahtōlli  •
Nedersaksisch  • Nouormand  • Novial  • Нохчийн  • Олык Марий  • O‘zbek  • पाऴि
• Pangasinán  • ਪੰਜਾਬੀ / پنجابی  • Papiamentu  • پښتو  • Picard  • 

Re: one column indexed, the other isnt

2011-02-01 Thread PeterKerk

I solved it by altering my SQL statement to return a 'true' or 'false' value:
CASE WHEN c.varstatement='False' THEN 'false' ELSE 'true' END as
varstatement 

Thanks! 
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/one-column-indexed-the-other-isnt-tp2389819p2399011.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Malformed XML with exotic characters

2011-02-01 Thread Markus Jelsma
You can exclude the input's involvement by checking whether other response 
writers work. For me, the JSONResponseWriter works perfectly with the same 
returned data in an AJAX environment.


Re: Malformed XML with exotic characters

2011-02-01 Thread Sascha Szott

Hi Markus,

in my case the JSON response writer returns valid JSON. The same holds 
for the PHP response writer.


-Sascha

On 01.02.2011 18:44, Markus Jelsma wrote:

You can exclude the input's involvement by checking if other response writers
do work. For me, the JSONResponseWriter works perfectly with the same returned
data in some AJAX environment.


Re: EdgeNgram Auto suggest - doubles ignore

2011-02-01 Thread johnnyisrael

Hi Erick,

I tried to use terms component, I got ended up with the following problems.

Problem: 1

Custom Sort not working in terms component:

http://lucene.472066.n3.nabble.com/Term-component-sort-is-not-working-td1905059.html#a1909386

I want to sort using one of my custom field[value_score], I gave it aleady
in my configuration, but it is not sorting properly.

The following are the configuration in solrconfig.xml

  <searchComponent name="termsComponent"
      class="org.apache.solr.handler.component.TermsComponent"/>

  <requestHandler name="/terms"
      class="org.apache.solr.handler.component.SearchHandler">
    <lst name="defaults">
      <bool name="terms">true</bool>
      <str name="wt">json</str>
      <str name="fl">name</str>
      <str name="sort">value_score desc</str>
      <str name="indent">true</str>
    </lst>
    <arr name="components">
      <str>termsComponent</str>
    </arr>
  </requestHandler>

The Solr response is not sorted according to the sort parameter.

Problem: 2

Case-sensitivity problem: [I am searching for Apple]

http://localhost/solr/core1/terms?terms.fl=name&terms.prefix=apple -- not
working

http://localhost/solr/core1/terms?terms.fl=name&terms.prefix=Apple --
working

Tried a regex to overcome the case-sensitivity problem:

http://localhost/solr/core1/terms?terms.fl=name&terms.regex=Apple&terms.regex.flag=case_insensitive

Will this regex-based search help with my requirement?

It is returning irrelevant results. I am using the same syntax as
mentioned in the wiki.

http://wiki.apache.org/solr/TermsComponent

Am I going wrong anywhere?
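
[A common workaround for the case-sensitivity part (an editorial suggestion, not from the wiki) is to index a lowercased copy of the suggestion field and lowercase the user's input before building the terms.prefix query:]

```python
from urllib.parse import urlencode

# Sketch: lowercase the user's input client-side so terms.prefix matches
# a lowercased suggestion field (URL and field name are assumptions).
def terms_url(user_input, base="http://localhost/solr/core1/terms"):
    params = {"terms.fl": "name", "terms.prefix": user_input.lower()}
    return base + "?" + urlencode(params)

print(terms_url("Apple"))
# http://localhost/solr/core1/terms?terms.fl=name&terms.prefix=apple
```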

Please let me know if you need any more info.

Thanks,

Johnny
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2399330.html
Sent from the Solr - User mailing list archive at Nabble.com.


SolrIndexWriter was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE LEAK!!!

2011-02-01 Thread Ravi Kiran
Hello,
  While reloading a core I got the following error. When does this
occur? Prior to this exception I do not see anything wrong in the logs.

[#|2011-02-01T13:02:36.697-0500|SEVERE|sun-appserver2.1|org.apache.solr.servlet.SolrDispatchFilter|_ThreadID=25;_ThreadName=httpWorkerThread-9001-5;_RequestID=450f6337-1f5c-42bc-a572-f0924de36b56;|org.apache.lucene.store.LockObtainFailedException:
Lock obtain timed out: NativeFSLock@
/data/solr/core/solr-data/index/lucene-7dc773a074342fa21d7d5ba09fc80678-write.lock
at org.apache.lucene.store.Lock.obtain(Lock.java:85)
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1565)
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1421)
at
org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:191)
at
org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:98)
at
org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:173)
at
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:220)
at
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61)
at
org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:139)
at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
at
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:246)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:214)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:313)
at
org.apache.catalina.core.StandardContextValve.invokeInternal(StandardContextValve.java:287)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:218)
at
org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:648)
at
org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:593)
at com.sun.enterprise.web.WebPipeline.invoke(WebPipeline.java:94)
at
com.sun.enterprise.web.PESessionLockingStandardPipeline.invoke(PESessionLockingStandardPipeline.java:98)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:222)
at
org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:648)
at
org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:593)
at
org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:587)
at
org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:1096)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:166)
at
org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:648)
at
org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:593)
at
org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:587)
at
org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:1096)
at
org.apache.coyote.tomcat5.CoyoteAdapter.service(CoyoteAdapter.java:290)
at
com.sun.enterprise.web.connector.grizzly.DefaultProcessorTask.invokeAdapter(DefaultProcessorTask.java:647)
at
com.sun.enterprise.web.connector.grizzly.DefaultProcessorTask.doProcess(DefaultProcessorTask.java:579)
at
com.sun.enterprise.web.connector.grizzly.DefaultProcessorTask.process(DefaultProcessorTask.java:831)
at
com.sun.enterprise.web.connector.grizzly.DefaultReadTask.executeProcessorTask(DefaultReadTask.java:341)
at
com.sun.enterprise.web.connector.grizzly.DefaultReadTask.doTask(DefaultReadTask.java:263)
at
com.sun.enterprise.web.connector.grizzly.DefaultReadTask.doTask(DefaultReadTask.java:214)
at
com.sun.enterprise.web.connector.grizzly.TaskBase.run(TaskBase.java:265)
at
com.sun.enterprise.web.connector.grizzly.WorkerThreadImpl.run(WorkerThreadImpl.java:116)
|#]

[#|2011-02-01T13:02:40.330-0500|SEVERE|sun-appserver2.1|org.apache.solr.update.SolrIndexWriter|_ThreadID=82;_ThreadName=Finalizer;_RequestID=121fac59-7b08-46b9-acaa-5c5462418dc7;|SolrIndexWriter
was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE
LEAK!!!|#]

[#|2011-02-01T13:02:40.330-0500|SEVERE|sun-appserver2.1|org.apache.solr.update.SolrIndexWriter|_ThreadID=82;_ThreadName=Finalizer;_RequestID=121fac59-7b08-46b9-acaa-5c5462418dc7;|SolrIndexWriter
was not closed prior to 

Re: Solr Indexing Performance

2011-02-01 Thread Darx Oman
Thanx  Tomas
I'll try with different configuration


Re: Malformed XML with exotic characters

2011-02-01 Thread Robert Muir
Hi, it might only be a problem with your XML tools (e.g., Firefox).
The problem here is characters outside of the Basic Multilingual Plane
(in this case Gothic).
XML tools typically fall apart on these portions of Unicode (in Lucene
we recently reverted to a patched/hacked copy of Xerces specifically
for this reason).

If you care about characters outside of the Basic Multilingual Plane
actually working, unfortunately you have to start being very, very
particular about what software you use... you can assume most
software/setups WON'T work.
For example, if you were to use MySQL's utf8 character set, you would
find it doesn't actually support all of UTF-8! In this case you would
need to use the recent 'utf8mb4' or something instead, which is
actually UTF-8!
That's just one example of a well-used piece of software that suffers
from issues like this; there are others.
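
[To make the surrogate issue concrete, here is a small illustration (editorial addition, not from the thread) using U+1D400, a supplementary-plane character whose UTF-16 high surrogate is exactly the 0xD835 that xmllint reported earlier in this thread:]

```python
# MATHEMATICAL BOLD CAPITAL A (U+1D400) lies outside the Basic
# Multilingual Plane, so UTF-16 encodes it as a surrogate pair.
ch = "\U0001D400"
utf16 = ch.encode("utf-16-be")
print(utf16.hex())              # d835dc00 -- high surrogate 0xD835, low 0xDC00
print(len(ch.encode("utf-8")))  # 4 bytes; MySQL's legacy 'utf8' stores at most 3
```

When XML output splits such a pair, a lone 0xD835 appears, which is exactly what xmllint rejects.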

It's for reasons like these that, if support for these languages is
important to you, I would stick with the most simple/textual methods
for input and output: e.g., using things like CSV and JSON if you can.
I would also fully test every component/jar in your application
individually, and once you get it working, don't ever upgrade.

In any case, if you are having problems with characters outside of the
Basic Multilingual Plane and you suspect it's actually a bug in Solr,
please open a JIRA issue, especially if you can provide some way to
reproduce it.

On Tue, Feb 1, 2011 at 10:43 AM, Markus Jelsma
markus.jel...@openindex.io wrote:
 There is an issue with the XML response writer. It cannot cope with some very
 exotic characters or possibly the right-to-left writing systems. The issue can
 be reproduced by indexing the content of the home page of wikipedia as it
 contains a lot of exotic matter. The problem does not affect the JSON response
 writer.

 The problem is, i am unsure whether this is a bug in Solr or that perhaps
 Firefox itself trips over.



Re: Sending binary data as part of a query

2011-02-01 Thread Jay Luker
On Mon, Jan 31, 2011 at 9:22 PM, Chris Hostetter
hossman_luc...@fucit.org wrote:

 that class should probably have been named ContentStreamUpdateHandlerBase
 or something like that -- it tries to encapsulate the logic that most
 RequestHandlers using ContentStreams (for updating) need to worry about.

 Your QueryComponent (as used by SearchHandler) should be able to access
 the ContentStreams the same way that class does ... call
 req.getContentStreams().

 Sending a binary stream from a remote client depends on how the client is
 implemented -- you can do it via HTTP using the POST body (with or w/o
 multi-part MIME) in any language you want. If you are using SolrJ you may
 again run into an assumption that using ContentStreams means you are doing
 an Update, but that's just a vernacular thing ... something like a
 ContentStreamUpdateRequest should work just as well for a query (as long
 as you set the necessary params and/or request handler path)

Thanks for the help. I was just about to reply to my own question for
the benefit of future googlers when I noticed your response. :)

I actually got this working, much the way you suggest. The client is
python. I created a gist with the script I used for testing [1].

On the solr side my QueryComponent grabs the stream, uses
jzlib.ZInputStream to do the deflating, then translates the incoming
integers in the bitset (my solr schema.xml integer ids) to the lucene
ids and creates a docSetFilter with them.
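
The client-side half can be sketched roughly like this (the endpoint path and int packing are assumptions on my part; the actual script is in the gist linked in this message):

```python
import struct
import urllib.request
import zlib

# Pack document ids as big-endian 32-bit ints and deflate them; the
# resulting blob is POSTed to a custom search handler ("/bitset" is a
# made-up path standing in for the real handler name).
doc_ids = [3, 17, 4096]
payload = zlib.compress(struct.pack(">%dI" % len(doc_ids), *doc_ids))

req = urllib.request.Request(
    "http://localhost:8983/solr/bitset?q=*:*&wt=json",
    data=payload,
    headers={"Content-Type": "application/octet-stream"},
)
# urllib.request.urlopen(req)  # requires a running Solr with such a handler
```

On the server side, the component inflates the stream and unpacks the ints, as described above.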

Very relieved to get this working as it's the basis of a talk I'm
giving next week [2]. :-)

--jay

[1] https://gist.github.com/806397
[2] http://code4lib.org/conference/2011/luker


Re: Lock obtain timed out: NativeFSLock

2011-02-01 Thread Alex Thurlow
I'm going to go ahead and reply to myself since I solved my problem.  
It seems I was doing one more update to the data at the end and wasn't 
doing a commit, so it then couldn't write to the other core. Adding the 
last commit seems to have fixed everything.
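
[For reference, that final commit can be issued explicitly over HTTP; a minimal sketch (URL and core name are assumptions):]

```python
import urllib.request

# Build an explicit <commit/> request for a core; sending it after the
# last batch of updates lets the index writer release its write lock.
def build_commit(core_url="http://localhost:8983/solr/core0"):
    return urllib.request.Request(
        core_url + "/update",
        data=b"<commit/>",
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )

# urllib.request.urlopen(build_commit())  # needs a running Solr core
```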


On 2/1/2011 11:08 AM, Alex Thurlow wrote:
I recently added a second core to my Solr setup, and I'm now running 
into this "Lock obtain timed out" error when I try to update one core 
after I've updated another core.


In my update process, I add/update 1000 documents at a time and commit 
in between.  Then at the end, I commit and optimize.  The update of 
the new core has about 150k documents.  If I try to update the old 
core any time after updating the new core (even a couple hours later), 
I get the below error.  I've tried switching to the simple lock, but 
that didn't change anything.  I've tried this on solr 1.4 and 1.4.1 
both with the spatial-solr-2.0-RC2 plugin loaded.


If I restart solr, I can then update the old core again.

Does anyone have any insight for me here?



Feb 1, 2011 10:59:57 AM org.apache.solr.common.SolrException log
SEVERE: org.apache.lucene.store.LockObtainFailedException: Lock obtain 
timed out: 
NativeFSLock@./solr/data/index/lucene-088f283afa122cf05ce7eadb1b5ce07b-write.lock

at org.apache.lucene.store.Lock.obtain(Lock.java:85)
        at 
org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1545)
        at 
org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1402)
        at 
org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:190)
at 
org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:98)
at 
org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:173)
at 
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:220)
at 
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61)
at 
org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:139)

at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)

at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
at 
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
at 
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at 
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at 
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
at 
org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
at 
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
at 
org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at 
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)

at org.mortbay.jetty.Server.handle(Server.java:285)
at 
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
at 
org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)

at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
at 
org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
at 
org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
at 
org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
at 
org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)







Solr and Eclipse

2011-02-01 Thread Eric Grobler
Hi

I am a newbie and I am trying to run solr in eclipse.

From this url
http://wiki.apache.org/solr/HowToContribute#Development_Environment_Tips
there is a subclipse example:

I used Team > Share Project with this URL:
  http://svn.apache.org/repos/asf/lucene/dev/trunk

but I get an "access forbidden for unknown reason" error.

I assume that with read-only HTTP access I do not need credentials?

Also, would it make more sense to check out the project with the
command-line svn client and then, in Eclipse, use
"Create project from existing source"?


Thanks
Ericz


best practice for solr-powered jsp?

2011-02-01 Thread Paul Libbrecht

Hello list,

this was asked again recently, but I still see no answer.
What is the best practice for writing JSP pages that render, for example,
Solr search results?

The only relevant thing I found is
http://www.ibm.com/developerworks/java/library/j-solr1/
"Search smarter with Apache Solr" by Grant Ingersoll at IBM developerWorks.

But that one is very old.

Imitating it, I would write a servlet that grabs the search components, puts
them in request attributes, and then dispatches to a JSP.

Is there a better way?
Is such code already part of a more widespread distribution?
I know there's Velocity, and that one works well, but testing with Velocity is 
really too much of a pain.

paul

Re: Lucene 3.0.3 index cannot be read by Solr

2011-02-01 Thread Grijesh

Solr 1.4 is built against Lucene 2.9. If your index was written in the
Lucene 3.0.3 format, it cannot be read by Lucene 2.9.

You can try replacing Solr's Lucene 2.9 jar with your Lucene 3.0.3 jar and
restarting your server.
Hope it may work.

-
Thanx:
Grijesh
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Lucene-3-0-3-index-cannot-be-read-by-Solr-tp2396649p2403161.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SOLR 1.4 and Lucene 3.0.3 index problem

2011-02-01 Thread Grijesh

You can extract the solr.war using Java's "jar -xvf solr.war" command,

replace lucene-2.9.jar with your lucene-3.0.3.jar in the WEB-INF/lib
directory,

then use "jar -cvf solr.war *" to pack the war again (note -cvf to
create the archive; -cxf is not a valid flag combination).

Deploy that war; hopefully that works.
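The same unpack/swap/repack round trip can also be done in one step without jar(1) at all. Here is a hedged sketch using Python's stdlib zipfile module (a war is just a zip archive); the file names are placeholders, not from the thread:

```python
# Sketch: copy a war, dropping one jar under WEB-INF/lib/ and adding a
# replacement jar in its place. File names are illustrative only.
import zipfile
from pathlib import Path

def swap_jar(war, old_jar, new_jar, out_war):
    """Write `out_war` as a copy of `war` with WEB-INF/lib/<old_jar>
    removed and the local file `new_jar` added under WEB-INF/lib/."""
    with zipfile.ZipFile(war) as src, \
         zipfile.ZipFile(out_war, "w", zipfile.ZIP_DEFLATED) as dst:
        found = False
        for item in src.infolist():
            if item.filename == f"WEB-INF/lib/{old_jar}":
                found = True  # skip the old jar instead of copying it
                continue
            dst.writestr(item, src.read(item.filename))
        if not found:
            raise FileNotFoundError(f"WEB-INF/lib/{old_jar} not in {war}")
        dst.write(new_jar, arcname=f"WEB-INF/lib/{Path(new_jar).name}")
```

For example, swap_jar("solr.war", "lucene-core-2.9.1.jar", "lucene-core-3.0.3.jar", "solr-patched.war") would produce a repacked war with the new jar, which you could then deploy as above.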

-
Thanx:
Grijesh
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/SOLR-1-4-and-Lucene-3-0-3-index-problem-tp2396605p2403542.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: nested faceting ?

2011-02-01 Thread Grijesh

Another patch for hierarchical faceting is also available:

https://issues.apache.org/jira/browse/SOLR-64

You can take a look at it; it may solve your problem.

-
Thanx:
Grijesh
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/nested-faceting-tp2389841p2403601.html
Sent from the Solr - User mailing list archive at Nabble.com.