Re: data-import run by cron job without waiting for the end of the previous one

2008-09-23 Thread sunnyfr

When I try without the adaptive parameter I get an OutOfMemoryError:

HTTP Status 500 - Java heap space java.lang.OutOfMemoryError: Java heap
space


Shalin Shekhar Mangar wrote:
 
 On Mon, Sep 22, 2008 at 9:19 PM, sunnyfr [EMAIL PROTECTED] wrote:
 

 Hi,
 There is something weird: I've planned a cron job every 5 min which hits the
 delta-import URL, and it works fine.
 The point is: it looks like it doesn't check each record for updating or
 creating a new one, because every 5 min the delta-import is started again
 (as if the previous delta-import were not done).

 
 That should not be happening. Why do you feel it is starting again without
 waiting for the previous import to finish?
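 
 One way to rule that out is to have the cron script poll the status URL
 first and only fire the next delta-import when the handler reports idle.
 A minimal sketch in Java (hypothetical host and handler path; the status
 check is just an HTTP GET against the DIH handler):
 
 import java.io.BufferedReader;
 import java.io.InputStreamReader;
 import java.net.URL;
 
 public class DeltaImportGuard {
     public static void main(String[] args) throws Exception {
         String base = "http://localhost:8983/solr/dataimport";
         // Read the current DIH status response.
         StringBuilder sb = new StringBuilder();
         BufferedReader in = new BufferedReader(
                 new InputStreamReader(new URL(base).openStream(), "UTF-8"));
         for (String line; (line = in.readLine()) != null; ) sb.append(line);
         in.close();
         // Only fire a new delta-import when the previous one has finished.
         if (sb.indexOf("idle") >= 0) {
             new URL(base + "?command=delta-import").openStream().close();
         } else {
             System.out.println("previous delta-import still busy, skipping");
         }
     }
 }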
 
 

 <str name="status">idle</str>
 <str name="importResponse"/>
 <lst name="statusMessages">
 <str name="Time Elapsed">0:2:23.885</str>
 <str name="Total Requests made to DataSource">1</str>
 <str name="Total Rows Fetched">1863146</str>
 <str name="Total Documents Processed">0</str>
 <str name="Total Documents Skipped">0</str>
 <str name="Delta Dump started">2008-09-22 17:40:01</str>
 <str name="Identifying Delta">2008-09-22 17:40:01</str>
 </lst>

 
 I'm confused by this output. How frequently do you update your database? How
 many rows are modified in the database in that 5 minute period?
 
 What is the type of the last-modified column in the database which you use
 for identifying the deltas?
 
 

 and I wonder if it comes from my data-config file parameters,
 which include "adaptive":

  <dataSource type="JdbcDataSource"
      driver="com.mysql.jdbc.Driver"
      url="jdbc:mysql://master.books.com/books"
      user="solr"
      password="tah1Axie"
      batchSize="-1"
      responseBuffering="adaptive"/>

 Thanks,

 
 The responseBuffering parameter is not applicable for MySQL, so you can
 remove it.
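 
 With MySQL's Connector/J, batchSize="-1" is how DIH asks for a streaming,
 row-by-row result set, which is what keeps the heap small on a 1.8M-row
 fetch. A sketch of the equivalent in plain JDBC, assuming the stock
 Connector/J driver:
 
 import java.sql.*;
 
 public class StreamingFetch {
     public static void main(String[] args) throws Exception {
         Class.forName("com.mysql.jdbc.Driver");
         Connection conn = DriverManager.getConnection(
                 "jdbc:mysql://master.books.com/books", "solr", "...");
         // TYPE_FORWARD_ONLY + CONCUR_READ_ONLY + fetchSize Integer.MIN_VALUE
         // tells Connector/J to stream rows one at a time instead of
         // buffering the whole result set in memory.
         Statement stmt = conn.createStatement(
                 ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
         stmt.setFetchSize(Integer.MIN_VALUE);
         ResultSet rs = stmt.executeQuery("SELECT id, title FROM books");
         while (rs.next()) {
             // process each row without holding all rows on the heap
         }
         rs.close(); stmt.close(); conn.close();
     }
 }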
 
 -- 
 Regards,
 Shalin Shekhar Mangar.
 
 

-- 
View this message in context: 
http://www.nabble.com/data-import-runned-by-cron-job-withou-wating-the-end-of-the-previous-one-tp19610823p19622383.html
Sent from the Solr - User mailing list archive at Nabble.com.




RE: Solr Using

2008-09-23 Thread Dinesh Gupta





Hi Otis,

Currently I am creating indexes from a standalone Java program.

I prepare the data with queries and have the data ready to index.

Can we write a function such as the one below?

I have a large number of products and we want to use this at production level.

Please provide me a sample or tutorials.


/**
 * 
 * 
 * @param pbi
 * @throws DAOException
 */
protected Document prepareLuceneDocument(Ismpbi pbi) throws DAOException {
    long start = System.currentTimeMillis();
    Long prn = pbi.getPbirfnum();
    if (!isValidProduct(pbi)) {
        if (logger.isDebugEnabled())
            logger.debug("Product Discarded " + prn + " not a valid product.");
        discarded++;
        return null;
    }

    IsmpptDAO pptDao = new IsmpptDAO();
    Set categoryList = new HashSet(pptDao.findByProductCategories(prn));

    Iterator iter = categoryList.iterator();
    Set directCategories = new HashSet();
    while (iter.hasNext()) {
        Object[] obj = (Object[]) iter.next();
        Long categoryId = (Long) obj[0];
        String categoryName = (String) obj[1];
        directCategories.add(new CategoryRecord(categoryId, categoryName));
    }

    if (directCategories.size() == 0) {
        if (logger.isDebugEnabled())
            logger.debug("Product Discarded " + prn
                    + " not placed in any category directly [ismppt].");
        discarded++;
        return null;
    }

    // Get all the categories for the direct categories - contains
    // CategoryRecord objects
    Set categories = getCategories(directCategories, prn);
    Set categoryIds = new HashSet(); // All category ids

    Iterator it = categories.iterator();
    while (it.hasNext()) {
        CategoryRecord rec = (CategoryRecord) it.next();
        categoryIds.add(rec.getId());
    }

    // All categories so far TOTAL (direct+parent categories)
    if (categoryIds.size() == 0) {
        if (logger.isDebugEnabled())
            logger.debug("Product Discarded " + prn + " direct categories are "
                    + "not placed under other categories.");
        discarded++;
        return null;
    }

    Set catalogues = getCatalogues(prn);
    if (catalogues.size() != 0) {
        if (logger.isDebugEnabled())
            logger.debug("[" + prn + "]- Total Direct PCC Catalogues ["
                    + collectionToStringNew(catalogues) + "]");
    }

    getCatalogueWithAllChildInCCR(prn, categoryIds, catalogues);
    if (catalogues.size() == 0) {
        if (logger.isDebugEnabled())
            logger.debug("Product Discarded " + prn + " not attached with "
                    + "any catalogue");
        discarded++;
        return null;
    }

    String productDirectCategories = collectionToString(directCategories);
    String productAllCategories = collectionToString(categories);
    String productAllCatalogues = collectionToStringNew(catalogues);

    String categoryNames = getCategoryNames(categories);

    if (logger.isInfoEnabled())
        logger.info("TO Document Product " + pbi.getPbirfnum() + " Dir Categories "
                + productDirectCategories + " All Categories "
                + productAllCategories + " And Catalogues "
                + productAllCatalogues);

    directCategories = null;
    categories = null;
    catalogues = null;

    Document document = new ProductDocument().toDocument(pbi,
            productAllCategories, productAllCatalogues,
            productDirectCategories, categoryNames);

    categoryNames = null;
    pbi = null;
    productAllCatalogues = null;
    productAllCategories = null;
    productDirectCategories = null;

    long time = System.currentTimeMillis() - start;
    if (time > longestIndexTime) {
        longestIndexTime = time;
    }
    return document;
}



 Date: Mon, 22 Sep 2008 22:10:16 -0700
 From: [EMAIL PROTECTED]
 Subject: Re: Solr Using
 To: solr-user@lucene.apache.org
 
 Dinesh,
 
 Please have a look at the Solr tutorial first.
 Then have a look at the new DataImportHandler - there is a very detailed page 
 about it on the Wiki.
 
 
 Otis
 --
 Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
 
 
 
 - Original Message 
  From: Dinesh Gupta [EMAIL PROTECTED]
  To: solr-user@lucene.apache.org
  Sent: Tuesday, September 23, 2008 1:02:34 AM
  Subject: Solr Using
  
  
  
  Hi All,
  
  I am new to Solr. I have been using Lucene for the last 2 years.
  
  We create Lucene indexes from a database.
  
  Please help me migrate to Solr.
  
  How can I achieve this?
  
  If anyone has an idea, please help.
  
  Thanks In Advance.
  
  
  Regards,
  Dinesh Gupta
  

Re: Searching for future or null dates

2008-09-23 Thread Michael Lackhoff
On 23.09.2008 00:30 Chris Hostetter wrote:

 : Here is what I was able to get working with your help.
 : 
 : (productId:(102685804)) AND liveDate:[* TO NOW] AND ((endDate:[NOW TO *]) OR
 : ((*:* -endDate:[* TO *])))
 : 
 : the *:* is what I was missing.
 
 Please, PLEASE ... do yourself a favor and stop using AND and OR ...
 food will taste better, flowers will smell fresher, and the world will be
 a happy shiny place...
 
 +productId:102685804 +liveDate:[* TO NOW] +(endDate:[NOW TO *] (*:* 
 -endDate:[* TO *]))

I would also like to follow your advice but don't know how to do it with
defaultOperator=AND. What I am missing is the equivalent to OR:
AND: +
NOT: -
OR: ???
I didn't find anything on the Solr or Lucene query syntax pages. If
there is such an equivalent then I guess the query would become:
productId:102685804 liveDate:[* TO NOW] (endDate:[NOW TO *] OR (*:*
-endDate:[* TO *]))

I switched to the AND-default because that is the default in my web
frontend so I don't have to change logic. What should I do in this
situation? Go back to the OR-default?

It is not so much this example I am after, but I have a syntax translator
in my application that must be able to handle similar expressions, and I
want to keep it simple and still have tasty food ;-)

-Michael


Re: Solr Using

2008-09-23 Thread Shalin Shekhar Mangar
Hi Dinesh,

Your code is hardly useful to us since we don't know what you are trying to
achieve or what all those Dao classes do.

Look at the Solr tutorial first -- http://lucene.apache.org/solr/
Use the SolrJ client for communicating with Solr server --
http://wiki.apache.org/solr/Solrj
Also take a look at DataImportHandler which can help avoid all this code --
http://wiki.apache.org/solr/DataImportHandler

If you face any problem, first search this mailing list through markmail.org or
nabble.com to find previous posts related to your issue. If you don't find
anything helpful, post specific questions here which we will help answer.
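
For reference, the shape of a minimal SolrJ indexing call (a sketch, assuming
the SolrJ 1.3 CommonsHttpSolrServer and a schema that defines the fields used
below):

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class IndexProducts {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "1");                // unique key from the database row
        doc.addField("ttl", "Example product"); // field names must exist in schema.xml
        server.add(doc);
        server.commit(); // make the new document searchable
    }
}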


Lucene index

2008-09-23 Thread Dinesh Gupta

Hi,
Currently we are using the Lucene API to create indexes.

It creates an index in a directory with 3 files: xxx.cfs, deletable, and
segments.

If I create Lucene indexes from Solr, will these files be created or not?

Please give me an example using a MySQL database instead of hsqldb.


Regards,
Dinesh


EmbeddedSolrServer and the MultiCore functionality

2008-09-23 Thread Aleksander M. Stensby
Hello everyone, I'm new to Solr (have been using Lucene for a few years  
now). We are looking into Solr and have heard many good things about the  
project:)


I have a few questions regarding the EmbeddedSolrServer in Solrj and the  
MultiCore features... I've tried to find answers to this in the archives  
but have not succeeded.
The thing is, I want to be able to use the Embedded server to access  
multiple cores on one machine, and I would like to at least have the  
possibility to access the lucene indexes without http. In particular I'm  
wondering if it is possible to do the shards (distributed search)  
approach using the embedded server, without using http requests.


let's say I register 2 cores to a container and init my embedded server
like this:

CoreContainer container = new CoreContainer();
container.register("core1", core1, false);
container.register("core2", core2, false);
server = new EmbeddedSolrServer(container, "core1");

then queries performed on my server will return results from core1... and
if I do ... = new EmbeddedSolrServer(container, "core2") the results will
come from core2.


If I have Solr up and running and do something like this:
query.set("shards",
"localhost:8080/solr/core0,localhost:8080/solr/core1");

I will get the results from both cores, obviously...

But is there a way to do this without using shards and accessing the cores  
through http?
I presume it would/should be possible to do the same thing directly  
against the cores, but my question is really if this has been implemented  
already / is it possible?



Thanks in advance for any replies!

Best regards,
 Aleksander


--
Aleksander M. Stensby
Senior Software Developer
Integrasco A/S
+47 41 22 82 72
[EMAIL PROTECTED]


Re: Lucene index

2008-09-23 Thread Shalin Shekhar Mangar
On Tue, Sep 23, 2008 at 5:33 PM, Dinesh Gupta [EMAIL PROTECTED] wrote:


 Hi,
 Currently we are using the Lucene API to create indexes.

 It creates an index in a directory with 3 files: xxx.cfs, deletable, and
 segments.

 If I create Lucene indexes from Solr, will these files be created or not?


The Lucene index will be created under solr_home, inside the data/index
directory.


 Please give me example on MySQL data base instead of hsqldb


If you are talking about DataImportHandler then there is no difference in
the configuration except for using the MySql driver instead of hsqldb.

-- 
Regards,
Shalin Shekhar Mangar.


RE: Lucene index

2008-09-23 Thread Dinesh Gupta

Hi Shalin Shekhar,

Let me explain my issue.

I have some tables in my database like

Product
Category 
Catalogue
Keywords
Seller
Brand
Country_city_group
etc.
I have a class that represents a product document, as follows:

Document doc = new Document();
// Keywords which can be used directly for search
doc.add(new Field("id", (String) data.get("PRN"),
        Field.Store.YES, Field.Index.UN_TOKENIZED));

// Sorting fields
String priceString = (String) data.get("Price");
if (priceString == null)
    priceString = "0";
long price = 0;
try {
    price = (long) Double.parseDouble(priceString);
} catch (Exception e) {
}

doc.add(new Field("prc", NumberUtils.pad(price),
        Field.Store.YES, Field.Index.UN_TOKENIZED));
Date createDate = (Date) data.get("CreateDate");
if (createDate == null) createDate = new Date();

doc.add(new Field("cdt", String.valueOf(createDate.getTime()),
        Field.Store.NO, Field.Index.UN_TOKENIZED));

Date modiDate = (Date) data.get("ModiDate");
if (modiDate == null) modiDate = new Date();

doc.add(new Field("mdt", String.valueOf(modiDate.getTime()),
        Field.Store.NO, Field.Index.UN_TOKENIZED));
//doc.add(Field.UnStored("cdt", String.valueOf(createDate.getTime())));

// Additional fields for search
doc.add(new Field("bnm", (String) data.get("Brand"),
        Field.Store.YES, Field.Index.TOKENIZED));
doc.add(new Field("bnm1", (String) data.get("Brand1"),
        Field.Store.NO, Field.Index.UN_TOKENIZED));
//doc.add(Field.Text("bnm", (String) data.get("Brand"))); // Tokenized and Unstored
doc.add(new Field("bid", (String) data.get("BrandId"),
        Field.Store.YES, Field.Index.UN_TOKENIZED));
//doc.add(Field.Keyword("bid", (String) data.get("BrandId"))); // untokenized
doc.add(new Field("grp", (String) data.get("Group"),
        Field.Store.NO, Field.Index.TOKENIZED));
//doc.add(Field.Text("grp", (String) data.get("Group")));
doc.add(new Field("gid", (String) data.get("GroupId"),
        Field.Store.YES, Field.Index.UN_TOKENIZED));
//doc.add(Field.Keyword("gid", (String) data.get("GroupId"))); // New
doc.add(new Field("snm", (String) data.get("Seller"),
        Field.Store.YES, Field.Index.UN_TOKENIZED));
//doc.add(Field.Text("snm", (String) data.get("Seller")));
doc.add(new Field("sid", (String) data.get("SellerId"),
        Field.Store.YES, Field.Index.UN_TOKENIZED));
//doc.add(Field.Keyword("sid", (String) data.get("SellerId"))); // New
doc.add(new Field("ttl", (String) data.get("Title"),
        Field.Store.YES, Field.Index.TOKENIZED));
//doc.add(Field.UnStored("ttl", (String) data.get("Title"), true));

String title1 = (String) data.get("Title");
title1 = removeSpaces(title1);
doc.add(new Field("ttl1", title1, Field.Store.NO, Field.Index.UN_TOKENIZED));

doc.add(new Field("ttl2", title1, Field.Store.NO, Field.Index.TOKENIZED));
//doc.add(Field.UnStored("ttl", (String) data.get("Title"), true));

// ColumnC - Product Sequence
String productSeq = (String) data.get("ProductSeq");
if (productSeq == null) productSeq = "";
doc.add(new Field("seq", productSeq, Field.Store.NO, Field.Index.UN_TOKENIZED));
//doc.add(Field.Keyword("seq", productSeq));

// New Added
doc.add(new Field("sdc", (String) data.get("SpecialDescription"),
        Field.Store.NO, Field.Index.TOKENIZED));
//doc.add(Field.UnStored("sdc", (String) data.get("SpecialDescription"), true));
doc.add(new Field("kdc", (String) data.get("KeywordDescription"),
        Field.Store.NO, Field.Index.TOKENIZED));
//doc.add(Field.UnStored("kdc", (String) data.get("KeywordDescription"), true));

// ColumnB - Product Category and parent categories
doc.add(new Field("cts", (String) data.get("Categories"),
        Field.Store.YES, Field.Index.TOKENIZED));
//doc.add(Field.Text("cts", (String) data.get("Categories")));

// ColumnB - Product Category and parent categories //Raman
doc.add(new Field("dct", (String) data.get("DirectCategories"),
        Field.Store.YES, Field.Index.TOKENIZED));
//doc.add(Field.Text("dct", (String) data.get("DirectCategories")));

// ColumnC - Product Catalogues
doc.add(new Field("clg", (String) data.get("Catalogues"),
        Field.Store.YES, Field.Index.TOKENIZED));
//doc.add(Field.Text("clg", (String) data.get("Catalogues")));

// Product Delivery Cities
doc.add(new Field("dcty", (String) data.get("DelCities"),
        Field.Store.YES, Field.Index.TOKENIZED));
// Additional Information
// Top Selling Count
String sellerCount = ((Long) data.get("SellCount")).toString();
doc.add(new Field("bsc", sellerCount, Field.Store.YES, Field.Index.TOKENIZED));


I am preparing this data by querying the database.
Please tell me how I can migrate my logic to Solr.
I have spent more than a week on it but have gotten nowhere.
Please help me.

Can I attach my files here?

Thanks in Advance

Regards
Dinesh Gupta

 Date: 

Optimise while uploading?

2008-09-23 Thread Geoff Hopson
Hi,

Probably a stupid question with the obvious answer, but if I am
running a Solr master and accepting updates, do I have to stop the
updates when I start the optimise of the index? Or will optimise just
take the latest snapshot and work on that independently of the
incoming updates?

Really enjoying Solr, BTW. Nice job!

Thanks
Geoff


Re: snapshot.yyyymmdd ... can't found them?

2008-09-23 Thread sunnyfr

Yes, indeed it was a problem with the path .. thanks a lot.
I just didn't get this part: "If you turn up your logging to FINE" -- what
does that mean?

Huge thanks for your answer,



hossman wrote:
 
 
 : And I did change my config file :
 : 
 : <!-- A postCommit event is fired after every commit or optimize command
 : <listener event="postCommit" class="solr.RunExecutableListener">
 
 ...that comment isn't closed, so perhaps it's closed after the 
 </listener> block and not getting used at all.  
 
 :   <str name="exe">data/solr/books/bin/snapshooter</str>
 
 I would strongly recommend you make that an absolute path, it's possible 
 your working directory isn't what you think it is.
 
 RunExecutableListener will log any errors it encounters trying to run 
 scripts, so you should check your Solr log for those ... If you turn up 
 your logging to FINE you'll see messages every time it runs something, 
 even if it succeeds.
 
 
 -Hoss
 
 
 

-- 
View this message in context: 
http://www.nabble.com/snapshot.mmdd-...-can%27t-found-them--tp19556507p19627832.html
Sent from the Solr - User mailing list archive at Nabble.com.



commit

2008-09-23 Thread sunnyfr

Hi,

I don't know why, when I start a commit manually, it doesn't fire snapshooter.
I did it manually because no snapshot was created, and if I run snapshooter
manually it works.

So my autocommit is activated (I think):
<autoCommit>
  <maxDocs>1</maxDocs>
  <maxTime>1000</maxTime>
</autoCommit>

My snapshooter too:
<!-- A postCommit event is fired after every commit or optimize command
-->
<listener event="postCommit" class="solr.RunExecutableListener">
  <str name="exe">./data/solr/book/logs/snapshooter</str>
  <str name="dir">data/solr/book/bin</str>
  <bool name="wait">true</bool>
  <arr name="args"> <str>arg1</str> <str>arg2</str> </arr>
  <arr name="env"> <str>MYVAR=val1</str> </arr>
</listener>

Updates are done on the server:
<str name="command">delta-import</str>
<str name="status">idle</str>
<str name="importResponse"/>
<lst name="statusMessages">
<str name="Total Requests made to DataSource">1513</str>
<str name="Total Rows Fetched">574</str>
<str name="Total Documents Skipped">0</str>
<str name="Delta Dump started">2008-09-23 16:00:01</str>
<str name="Identifying Delta">2008-09-23 16:00:01</str>
<str name="Deltas Obtained">2008-09-23 16:00:37</str>
<str name="Building documents">2008-09-23 16:00:37</str>
<str name="Total Changed Documents">216</str>
<str name="">
Indexing completed. Added/Updated: 216 documents. Deleted 0 documents.
</str>
<str name="Committed">2008-09-23 16:01:29</str>
<str name="Time taken ">0:1:28.667</str>
</lst>

and everything is in the right place I think; my paths are good ...


-- 
View this message in context: 
http://www.nabble.com/commit-tp19628500p19628500.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Lucene index

2008-09-23 Thread Shalin Shekhar Mangar
Hi Dinesh,

This seems straightforward for Solr. You can use the embedded jetty server
for a start. Look at the tutorial on how to get started.

You'll need to modify the schema.xml to define all the fields that you want
to index. The wiki page at http://wiki.apache.org/solr/SchemaXml is a good
start on how to do that. Each field in your code will have a counterpart in
the schema.xml with appropriate flags (indexed/stored/tokenized etc.)

Once that is complete, try to modify the DataImportHandler's hsqldb example
for your mysql database.


Re: Optimise while uploading?

2008-09-23 Thread Shalin Shekhar Mangar
On Tue, Sep 23, 2008 at 7:06 PM, Geoff Hopson [EMAIL PROTECTED] wrote:


 Probably a stupid question with the obvious answer, but if I am
 running a Solr master and accepting updates, do I have to stop the
 updates when I start the optimise of the index? Or will optimise just
 take the latest snapshot and work on that independently of the
 incoming updates?


Usually an optimize is performed at the end of the indexing operation.
However, an optimize operation will block incoming update requests until it
completes.

Snapshots are a different story. Solr does not even know about any snapshots
-- all operations are performed on the main index only. If you look under
the hood, it is the snapshooter shell script which creates the snapshot
directories.
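
For what it's worth, from SolrJ each of these is a single call (a sketch,
assuming a SolrJ 1.3 server handle; the optimize() call does not return
until the merge finishes):

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class CommitVsOptimize {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        server.commit();   // flush pending adds; cheap enough to run often
        server.optimize(); // merge segments; blocks concurrent updates until done
    }
}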

-- 
Regards,
Shalin Shekhar Mangar.


Re: commit

2008-09-23 Thread Shalin Shekhar Mangar
On Tue, Sep 23, 2008 at 7:36 PM, sunnyfr [EMAIL PROTECTED] wrote:


  My snapshooter too:
 <!-- A postCommit event is fired after every commit or optimize command
 -->
 <listener event="postCommit" class="solr.RunExecutableListener">
   <str name="exe">./data/solr/book/logs/snapshooter</str>
   <str name="dir">data/solr/book/bin</str>
   <bool name="wait">true</bool>
   <arr name="args"> <str>arg1</str> <str>arg2</str> </arr>
   <arr name="env"> <str>MYVAR=val1</str> </arr>
 </listener>
  and everything is in the right place I think; my paths are good ...


Those paths look strange. Are you sure your snapshooter script is inside a
directory named "logs"?

Try giving an absolute path to the snapshooter script in the "exe" section.
Also, put the absolute path to the bin directory in the "dir" section and
try again.

-- 
Regards,
Shalin Shekhar Mangar.


Re: commit

2008-09-23 Thread sunnyfr

Right, my bad, it was the bin directory. But even when I fire a commit, no
snapshot is created?? Does it check the number of documents even when I fire
it manually? And another question: I don't remember putting the path to commit
in the conf file, but even manually it doesn't work ...

[EMAIL PROTECTED]:/# ./data/solr/book/bin/commit -V
+ [[ -n '' ]]
+ [[ -z 8180 ]]
+ [[ -z localhost ]]
+ [[ -z solr ]]
+ curl_url=http://localhost:8180/solr/update
+ fixUser -V
+ [[ -z root ]]
++ whoami
+ [[ root != root ]]
++ who -m
++ cut '-d ' -f1
++ sed '-es/^.*!//'
+ oldwhoami=root
+ [[ root == '' ]]
+ setStartTime
+ [[ Linux == \S\u\n\O\S ]]
++ date +%s
+ start=1222180545
+ logMessage started by root
++ timeStamp
++ date '+%Y/%m/%d %H:%M:%S'
+ echo 2008/09/23 16:35:45 started by root
+ [[ -n '' ]]
+ logMessage command: ./data/solr/book/bin/commit -V
++ timeStamp
++ date '+%Y/%m/%d %H:%M:%S'
+ echo 2008/09/23 16:35:45 command: ./data/solr/book/bin/commit -V
+ [[ -n '' ]]
++ curl http://localhost:8180/solr/update -s -H 'Content-type:text/xml;
charset=utf-8' -d '<commit/>'
+ rs='<result status="0"></result>'
+ [[ 0 != 0 ]]
+ echo '<result' 'status="0"></result>'
+ grep '<result.*status="0"'
+ [[ 0 != 0 ]]
+ logExit ended 0
+ [[ Linux == \S\u\n\O\S ]]
++ date +%s
+ end=1222180546
++ expr 1222180546 - 1222180545
+ diff=1
++ timeStamp
++ date '+%Y/%m/%d %H:%M:%S'
+ echo '2008/09/23 16:35:46 ended (elapsed time: 1 sec)'
+ exit 0





-- 
View this message in context: 
http://www.nabble.com/commit-tp19628500p19629217.html
Sent from the Solr - User mailing list archive at Nabble.com.



Refresh of synonyms.txt without reload

2008-09-23 Thread Batzenmann

Hi,

I'm quite new to Solr and I'm looking for a way to extend the list of
synonyms used at query time without having to reload the config. What I've
found so far are the two threads linked below, neither of which really
helped me out.
Especially the MultiCore solution seems a little bit too much for 'just
reloading' the synonyms..

Right now I would choose a solution where I'd extend the
SynonymFilterFactory with a parameter for an interval in which it would look
for an update of the synonyms source file (synonyms.txt).
In case of an updated file the SynMap would be updated and from that point
on the new synonyms would be included in the query analysis.
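
In plain Java, the mtime-polling part might look like this (a sketch only,
independent of the real SynonymFilterFactory API; SynonymLoader here is a
hypothetical stand-in for whatever parses synonyms.txt):

import java.io.File;
import java.util.Map;

public class ReloadingSynonyms {
    private final File file;
    private final long intervalMs;
    private volatile long lastChecked;
    private volatile long lastModified;
    private volatile Map<String, String[]> synMap;

    public ReloadingSynonyms(File file, long intervalMs) {
        this.file = file;
        this.intervalMs = intervalMs;
        reload();
    }

    public Map<String, String[]> current() {
        long now = System.currentTimeMillis();
        if (now - lastChecked > intervalMs) {
            lastChecked = now;
            if (file.lastModified() != lastModified) {
                reload(); // file changed on disk: rebuild the map
            }
        }
        return synMap;
    }

    private synchronized void reload() {
        lastModified = file.lastModified();
        synMap = SynonymLoader.parse(file); // hypothetical parser
    }
}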

Is this a valid approach? Would someone else find this useful too?

cheers, Axel

http://www.nabble.com/SolrCore%2C-reload%2C-synonyms-not-reloaded-td19339767.html
Multiple Solr Cores 
http://www.nabble.com/Re%3A-Is-it-possible-to-add-synonyms-run-time--td15089111.html
Re: Is it possible to add synonyms run time? 
-- 
View this message in context: 
http://www.nabble.com/Refresh-of-synonyms.txt-without-reload-tp19629361p19629361.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Refresh of synonyms.txt without reload

2008-09-23 Thread Walter Underwood
This is probably not useful because synonyms work better at index time
than at query time. Reloading synonyms also requires reindexing all
the affected documents.

wunder




RE: deleting record from the index using deleteByQuery method

2008-09-23 Thread Kashyap, Raghu
Thanks for your response Chris.

I do see the reviewid in the index through Luke. I guess what I am
confused about is the cumulative_delete field. Does this have any
significance for whether the delete was a success or not? Also, shouldn't
the deleteByQuery method return a different status code based on whether
the delete was successful or not?

-Raghu 


-Original Message-
From: Chris Hostetter [mailto:[EMAIL PROTECTED] 
Sent: Monday, September 22, 2008 11:30 PM
To: solr-user@lucene.apache.org
Subject: Re: deleting record from the index using deleteByQuery method


:  I am trying to delete a record from the index using SolrJ. When I
: execute it I get a status of 0 which means success. I see that the
: cummulative_deletbyquery count increases by 1 and also the commit
: count increases by one. I don't see any decrease on the numDocs
count.
: When I query it back I do see that record again. 

I'm not positive, but I don't think deleting by query will error if no
documents matched the query -- so just because it succeeds doesn't mean it
actually deleted anything ... are you sure '"rev.id:" + reviewId' matches
the document you are trying to delete?  does that search find it using
the default handler?  (is there any analyzer weirdness?)
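
One way to check is to run the query first and look at numFound before
deleting (a sketch, assuming a SolrJ server handle and that rev.id is the
indexed field name; the id value is hypothetical):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class DeleteCheck {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        String q = "rev.id:12345"; // hypothetical review id
        // If numFound is 0 here, the later deleteByQuery will "succeed"
        // without removing anything.
        long found = server.query(new SolrQuery(q)).getResults().getNumFound();
        System.out.println("matches before delete: " + found);
        server.deleteByQuery(q);
        server.commit();
    }
}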



-Hoss





DataImport troubleshooting

2008-09-23 Thread KyleMorrison

I have searched the forum and the internet at large to find an answer to my
simple problem, but have been unable to. I am trying to get a simple data
import to work, and have not been able to. I have Solr installed on an Apache
server on Unix. I am able to commit and search for files using the usual
Simple* tools. These files begin with <add>... and so on.

On the data import, I have inserted

<requestHandler name="/dataimport"
    class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">/R1/home/shoshana/kyle/Documents/data-config.xml</str>
  </lst>
</requestHandler>

into solrconfig, and the data import looks like this:

<dataConfig>
<dataSource type="FileDataSource"
    baseUrl="http://helix.ccb.sickkids.ca:8080/" encoding="UTF-8" />
<document>
<entity name="page" processor="XPathEntityProcessor" stream="true"
    forEach="/iProClassDatabase/iProClassEntry/"
    url="/R1/home/shoshana/kyle/Documents/exampleIproResult.xml">
<field column="UniProtKB_Accession"
    xpath="/iProClassDatabase/iProClassEntry/GENERAL_INFORMATION/Protein_Name_and_ID/UniProtKB/UniProtKB_Accession" />
<field column="Nomenclature"
    xpath="/iProClassDatabase/iProClassEntry/CROSS_REFERENCES/Enzyme_Function/EC/Nomenclature" />
<field column="PMID"
    xpath="/iProClassDatabase/iProClassEntry/CROSS_REFERENCES/Bibliography/References/PMID" />
<field column="Sequence_Length"
    xpath="/iProClassDatabase/iProClassEntry/SEQUENCE/Sequence_Length" />
</entity>
</document>
</dataConfig>

I apologize for the ugly XML. Nonetheless, when I go to
http://host:8080/solr/dataimport, I get a 404, and when I go to
http://host:8080/solr/admin/dataimport.jsp and try to debug, nothing
happens. I have edited out the host name because I don't know if the
employer would be ok with it. Any guidance?

Thanks in advance,
Kyle
-- 
View this message in context: 
http://www.nabble.com/DataImport-troubleshooting-tp19630990p19630990.html
Sent from the Solr - User mailing list archive at Nabble.com.



SolrUpdateServlet Warning

2008-09-23 Thread Gregg
I've got a small configuration question. When posting docs via SolrJ, I get
the following warning in the Solr logs:

WARNING: The @Deprecated SolrUpdateServlet does not accept query parameters:
wt=xml&version=2.2
  If you are using solrj, make sure to register a request handler to /update
rather then use this servlet.
  Add: <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" >
to your solrconfig.xml

I have an update handler configured in solrconfig.xml as follows:

<requestHandler name="/update" class="solr.XmlUpdateRequestHandler" />

What's the preferred solution? Should I comment out the SolrUpdateServlet in
solr's web.xml? My Solr server is running at /solr, if that helps.

Thanks.

Gregg


Re: SolrUpdateServlet Warning

2008-09-23 Thread Ryan McKinley


On Sep 23, 2008, at 12:35 PM, Gregg wrote:

 I've got a small configuration question. When posting docs via SolrJ, I get
 the following warning in the Solr logs:

 WARNING: The @Deprecated SolrUpdateServlet does not accept query parameters:
 wt=xml&version=2.2
   If you are using solrj, make sure to register a request handler to /update
 rather then use this servlet.
   Add: <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" >
 to your solrconfig.xml

 I have an update handler configured in solrconfig.xml as follows:

 <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" />



are you sure?

check http://localhost:8983/solr/admin/stats.jsp
and search for XmlUpdateRequestHandler
make sure it is registered to /update


 What's the preferred solution? Should I comment out the SolrUpdateServlet in
 solr's web.xml? My Solr server is running at /solr, if that helps.


that will definitely work, but it should not be necessary to crack open
the .war file.



ryan


Re: DataImport troubleshooting

2008-09-23 Thread Shalin Shekhar Mangar
Are there any exceptions in the log file when you start Solr?





-- 
Regards,
Shalin Shekhar Mangar.


Re: Precision issue with sum() function

2008-09-23 Thread water4u99

Problem with the spam filter - removing some text - re-posting.

water4u99 wrote:
 
 Hi,
 
 Some additional clue as to where the issue is: the computed number changes
 when there is an additional query term in the request.
 
 Ex1: .../select/?q=_val_:%22sum(stockPrice_f,10.00)%22&fl=*,score
 This yields the correct answer, 38.0, where the stockPrice_f dynamic field
 has the value of 28.0.
 
 However, when there is another query term, the answer changes.
 Ex2:
 .../select/?q=PRICE_MIN:20%20_val_:%22sum(stockPrice_f,10.00)%22&fl=*,score
 
 This yields an incorrect answer: 36.41818
 
 The config is straight out of the examples/ directory with only my own
 field definitions.
 
 Thanks if anyone can explain or help.
 
 
 
 
 water4u99 wrote:
 
 Hi,
 
  I have indexed a dynamic field in the add doc as:
  <field name="stockPrice_f">28.00</field>
  It is visible in my query.
  However, when I issue a query with a function:
  ..._val_:"sum(stockPrice_f, 10.00)"&fl=*,score
  I received the output of: <float name="score">36.41818</float>
 There were no other computations.
 
 Can any one help on why the answer is off.
 
 Thank you.
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Precision-issue-with-sum%28%29-function-tp19616287p19633206.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: How to use copyfield with dynamicfield?

2008-09-23 Thread Erik Hatcher

Simply set "text" to be multiValued (one value for each *_t field).

Erik

On Sep 22, 2008, at 1:08 PM, Jon Drukman wrote:


I have a dynamicField declaration:

<dynamicField name="*_t" type="text" indexed="true" stored="true"/>

I want to copy any *_t's into a "text" field for searching with
dismax. As it is, it appears you can't search dynamicfields this way.

I tried adding a copyField:

<copyField source="*_t" dest="text"/>

I do have a "text" field in my schema:
<field name="text" type="text" indexed="true" stored="true"/>


However I get 400 errors whenever I try to update a record with  
entries in the *_t.



INFO: /update  0 2
Sep 22, 2008 10:04:40 AM org.apache.solr.core.SolrException log
SEVERE: org.apache.solr.core.SolrException: ERROR: multiple values
encountered for non multiValued field text: first='Centennial Dr,
Oakland, CA' second=''
        at org.apache.solr.update.DocumentBuilder.addSingleField(DocumentBuilder.java:62)


I'm going to guess that the copyField with a wildcard is not  
allowed. If that is true, how does one deal with the situation where  
you want to allow new fields AND have them searchable?


-jsd-




Re: DataImport troubleshooting

2008-09-23 Thread KyleMorrison

Thank you for the help. The problem was actually just stupidity on my part: it
seems I was running the wrong startup and shutdown scripts for the server,
and thus the server was not actually getting restarted. I restarted the server
and I can now at least access those pages. I'm getting some wonky output, but
I assume this will get sorted out.

Kyle




-- 
View this message in context: 
http://www.nabble.com/DataImport-troubleshooting-tp19630990p19635170.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: SolrUpdateServlet Warning

2008-09-23 Thread Gregg
This turned out to be a fairly pedestrian bug on my part: I had /update
appended to the Solr base URL when I was adding docs via SolrJ.
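
In other words: give SolrJ the webapp root and let it append the handler
path itself. A sketch, assuming SolrJ 1.3's CommonsHttpSolrServer:

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class BaseUrlFix {
    public static void main(String[] args) throws Exception {
        // Wrong: the trailing /update sends requests to the deprecated servlet.
        // SolrServer bad = new CommonsHttpSolrServer("http://localhost:8983/solr/update");

        // Right: point SolrJ at the Solr webapp root; it appends /update itself.
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        server.ping();
    }
}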

Thanks for the help.

--Gregg



Highlight Fragments

2008-09-23 Thread David Snelling
Ok, I'm very frustrated. I've tried every configuration and parameter I can,
and I cannot get fragments to show up in the highlighting in Solr (no
fragments at the bottom and no <em></em> highlights in the text). I must be
missing something, but I'm just not sure what it is.

/select/?qt=standard&q=crayon&hl=true&hl.fl=synopsis,shortdescription&hl.fragmenter=gap&hl.snippets=3&debugQuery=true

And I get a highlighting segment, but no fragments or phrase highlighting.

My goal - if I'm doing this completely wrong - is to get Google-like
snippets of text around the query term (or at minimum to highlight the query
term itself).

Results:
<lst name="params">
<str name="fl">synopsis</str>
<str name="debugQuery">true</str>
<str name="hl.snippets">3</str>
<str name="hl.fragmenter">gap</str>
<str name="q">crayon</str>
<str name="hl.fl">synopsis</str>
<str name="qt">standard</str>
<str name="hl">true</str>
<str name="version">2.1</str>
</lst>
</responseHeader>
<result name="response" numFound="237" start="0">
...
</result>
<lst name="highlighting">
<lst name="206738"/>
<lst name="184583"/>
<lst name="203809"/>
<lst name="201588"/>
<lst name="207554"/>
<lst name="157569"/>
<lst name="199682"/>
<lst name="196359"/>
<lst name="202940"/>
<lst name="190672"/>
</lst>

-- 
hic sunt dracones


Re: Highlight Fragments

2008-09-23 Thread wojtekpia

Make sure the fields you're trying to highlight are stored in your schema
(e.g. <field name="synopsis" type="string" stored="true" />)




-- 
View this message in context: 
http://www.nabble.com/Highlight-Fragments-tp19636705p19636915.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Highlight Fragments

2008-09-23 Thread David Snelling
This is the configuration for the two fields I have tried on

<field name="shortdescription" type="string" indexed="true" stored="true"/>
<field name="synopsis" type="string" indexed="true" stored="true"
compressed="true"/>







-- 
hic sunt dracones


Re: Highlight Fragments

2008-09-23 Thread wojtekpia

Try a query where you're sure to get something to highlight in one of your
highlight fields, for example:

/select/?qt=standard&q=synopsis:crayon&hl=true&hl.fl=synopsis,shortdescription
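
The same request via SolrJ, for reference (a sketch, assuming SolrJ 1.3's
SolrQuery highlight helpers):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class HighlightQuery {
    public static void main(String[] args) throws Exception {
        CommonsHttpSolrServer server =
                new CommonsHttpSolrServer("http://localhost:8983/solr");
        SolrQuery q = new SolrQuery("synopsis:crayon");
        q.setHighlight(true);
        q.addHighlightField("synopsis");
        q.addHighlightField("shortdescription");
        q.setHighlightSnippets(3);
        QueryResponse rsp = server.query(q);
        // Map of doc id -> (field -> snippets containing <em> markers)
        System.out.println(rsp.getHighlighting());
    }
}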




-- 
View this message in context: 
http://www.nabble.com/Highlight-Fragments-tp19636705p19637261.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: using BoostingTermQuery

2008-09-23 Thread Grant Ingersoll
At this point, it's roll your own.  I'd love to see the BTQ in Solr  
(and Spans!), but I wonder if it makes sense w/o better indexing side  
support.  I assume you are rolling your own Analyzer, right?  Spans  
and payloads are this huge untapped area for better search!



On Sep 23, 2008, at 5:12 PM, Ensdorf Ken wrote:


Hi-

I'm new to Solr, and I'm trying to figure out the best way to  
configure it to use BoostingTermQuery in the scoring mechanism.  Do  
I need to create a custom query parser?  All I want is the default  
parser behavior except to get the custom term boost from the Payload  
data.  Thanks!


-Ken


--
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ









Re: using BoostingTermQuery

2008-09-23 Thread Otis Gospodnetic
It may be too early to say this but I'll say it anyway :)
There should be a juicy case study that includes payloads, BTQ, and Spans in 
the upcoming Lucene in Action 2.  I can't wait to see it, personally.


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
 From: Grant Ingersoll [EMAIL PROTECTED]
 To: solr-user@lucene.apache.org
 Sent: Tuesday, September 23, 2008 5:29:05 PM
 Subject: Re: using BoostingTermQuery
 
 At this point, it's roll your own.  I'd love to see the BTQ in Solr  
 (and Spans!), but I wonder if it makes sense w/o better indexing side  
 support.  I assume you are rolling your own Analyzer, right?  Spans  
 and payloads are this huge untapped area for better search!
 
 
 On Sep 23, 2008, at 5:12 PM, Ensdorf Ken wrote:
 
  Hi-
 
  I'm new to Solr, and I'm trying to figure out the best way to  
  configure it to use BoostingTermQuery in the scoring mechanism.  Do  
  I need to create a custom query parser?  All I want is the default  
  parser behavior except to get the custom term boost from the Payload  
  data.  Thanks!
 
  -Ken
 
 --
 Grant Ingersoll
 http://www.lucidimagination.com
 
 Lucene Helpful Hints:
 http://wiki.apache.org/lucene-java/BasicsOfPerformance
 http://wiki.apache.org/lucene-java/LuceneFAQ



RE: using BoostingTermQuery

2008-09-23 Thread Ensdorf Ken

 At this point, it's roll your own.

That's where I'm getting bogged down - I'm confused by the various query
parser classes in Lucene and Solr, and I'm not sure exactly what I need to
override.  Do you know of an example of something similar to what I'm doing
that I could use as a reference?

 I'd love to see the BTQ in Solr
 (and Spans!), but I wonder if it makes sense w/o better indexing side
 support.  I assume you are rolling your own Analyzer, right?

Yup - I'm pretty sure I have that side figured out.  My input contains terms
marked up with a score (e.g. 'software?7'); I just needed to create a
TokenFilter that parses out the suffix and sets the Payload on the token.
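
For reference, a rough, untested sketch of such a filter against the Lucene
2.x Token API (the class name, the '?' delimiter, and the 4-byte float
encoding are assumptions for illustration, not Ken's actual code):

import java.io.IOException;
import java.nio.ByteBuffer;

import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.index.Payload;

/** Turns a token like "software?7" into the term "software" carrying a
 *  4-byte float payload of 7.0. */
public class ScoreSuffixPayloadFilter extends TokenFilter {
  private final char delimiter;

  public ScoreSuffixPayloadFilter(TokenStream input, char delimiter) {
    super(input);
    this.delimiter = delimiter;
  }

  public Token next(Token reusableToken) throws IOException {
    Token token = input.next(reusableToken);
    if (token == null) return null;  // end of stream

    String text = token.term();
    int pos = text.lastIndexOf(delimiter);
    if (pos >= 0) {
      // parse the "?7" suffix into a float and store it as the payload
      float score = Float.parseFloat(text.substring(pos + 1));
      byte[] bytes = ByteBuffer.allocate(4).putFloat(score).array();
      token.setPayload(new Payload(bytes));
      token.setTermBuffer(text.substring(0, pos));  // strip the suffix
    }
    return token;
  }
}

Note that on the query side BoostingTermQuery feeds the payload bytes
through Similarity.scorePayload(), so a custom Similarity that decodes the
same four bytes back into a float is presumably needed as well.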

  Spans and payloads are this huge untapped area for better search!

Completely agree - we do a lot with keyword searching, and we use this type of 
thing in our existing search implementation.  Thanks for the quick response!

 On Sep 23, 2008, at 5:12 PM, Ensdorf Ken wrote:

  Hi-
 
  I'm new to Solr, and I'm trying to figure out the best way to
  configure it to use BoostingTermQuery in the scoring mechanism.  Do
  I need to create a custom query parser?  All I want is the default
  parser behavior except to get the custom term boost from the Payload
  data.  Thanks!
 
  -Ken

 --
 Grant Ingersoll
 http://www.lucidimagination.com

 Lucene Helpful Hints:
 http://wiki.apache.org/lucene-java/BasicsOfPerformance
 http://wiki.apache.org/lucene-java/LuceneFAQ










Re: Highlight Fragments

2008-09-23 Thread David Snelling
Hmmm. That doesn't actually return anything, which is odd, because I know
it's in the field if I do a query without specifying the field.

http://qasearch.donorschoose.org/select/?q=synopsis:students

returns nothing

http://qasearch.donorschoose.org/select/?q=students

returns items with query in synopsis field.

This may be causing issues but I'm not sure why it's not working. We use
this live and do very complex queries including facets that work fine.

www.donorschoose.org



On Tue, Sep 23, 2008 at 2:20 PM, wojtekpia [EMAIL PROTECTED] wrote:


 Try a query where you're sure to get something to highlight in one of your
 highlight fields, for example:


  /select/?qt=standard&q=synopsis:crayon&hl=true&hl.fl=synopsis,shortdescription







-- 
hic sunt dracones


Re: Highlight Fragments

2008-09-23 Thread wojtekpia

Your fields are all of string type. String fields aren't tokenized or
analyzed, so you have to match the entire text of those fields to actually
get a match. Try the following:

/select/?q=firstname:Kathryn&hl=on&hl.fl=firstname

The reason you're seeing results with just q=students, but not with
q=synopsis:students, is that you're copying the synopsis field into your
field named 'text', which is of type 'text', which does get tokenized and
analyzed, and 'text' is your default search field.

The reason you don't see any highlights with the following query is that
your 'text' field isn't stored:

/select/?q=text:students&hl=on&hl.fl=text
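
For reference, that kind of copy is normally wired up in schema.xml with
copyField rules along these lines (a sketch; the field names are assumed
from this thread):

<copyField source="synopsis" dest="text"/>
<copyField source="shortdescription" dest="text"/>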





David Snelling-2 wrote:
 
  Hmmm. That doesn't actually return anything, which is odd, because I know
  it's in the field if I do a query without specifying the field.
 
 http://qasearch.donorschoose.org/select/?q=synopsis:students
 
 returns nothing
 
 http://qasearch.donorschoose.org/select/?q=students
 
 returns items with query in synopsis field.
 
 This may be causing issues but I'm not sure why it's not working. We use
 this live and do very complex queries including facets that work fine.
 
 www.donorschoose.org
 
 
 


 
 
 -- 
 hic sunt dracones
 
 

-- 
View this message in context: 
http://www.nabble.com/Highlight-Fragments-tp19636705p19637801.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Highlight Fragments

2008-09-23 Thread David Snelling
Ok, thanks, that makes a lot of sense now.
So how should I be storing the text for the synopsis and shortdescription
fields so that it gets tokenized? Should the type be text instead of string?


Thank you very much for the help by the way.


On Tue, Sep 23, 2008 at 2:49 PM, wojtekpia [EMAIL PROTECTED] wrote:


 Your fields are all of string type. String fields aren't tokenized or
 analyzed, so you have to match the entire text of those fields to actually
 get a match. Try the following:

  /select/?q=firstname:Kathryn&hl=on&hl.fl=firstname

 The reason you're seeing results with just q=students, but not
 q=synopsis:students is because you're copying the synopsis field into your
 field named 'text', which is of type 'text', which does get tokenized and
 analyzed, and 'text' is your default search field.

 The reason you don't see any highlights with the following query is because
 your 'text' field isn't stored.

  /select/?q=text:students&hl=on&hl.fl=text









-- 
hic sunt dracones


Re: Highlight Fragments

2008-09-23 Thread wojtekpia

Yes, you can use text (or some custom derivative of it) for your fields. 
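
For example, something along these lines (a sketch only; the field names
are taken from this thread, and note that a schema change like this only
applies to newly indexed documents, so a full re-index is required):

<field name="shortdescription" type="text" indexed="true" stored="true"/>
<field name="synopsis" type="text" indexed="true" stored="true"
compressed="true"/>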


David Snelling-2 wrote:
 
 Ok, thanks, that makes a lot of sense now.
 So, how should I be storing the text for the synopsis or shortdescription
 fields so it would be tokenized? Should it be text instead of string?
 
 
 Thank you very much for the help by the way.
 
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Highlight-Fragments-tp19636705p19638296.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: using BoostingTermQuery

2008-09-23 Thread Grant Ingersoll


On Sep 23, 2008, at 5:39 PM, Ensdorf Ken wrote:




At this point, it's roll your own.


That's where I'm getting bogged down - I'm confused by the various
query parser classes in Lucene and Solr, and I'm not sure exactly what
I need to override.  Do you know of an example of something similar
to what I'm doing that I could use as a reference?


I'm no QueryParser expert, but I would probably start w/ the default  
query parser in Solr (LuceneQParser), and then progress a bit to the  
DisMax one.  I'd ask specific questions based on what you see there.   
If you get far enough along, you may consider asking for help on the  
java-user list as well.
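
To make "start w/ the default query parser" a bit more concrete, here is a
very rough, untested sketch of the plugin shape, assuming the Solr 1.3
QParser API (the class name and the hard-coded field are made up, and a
real implementation would delegate to the default parser rather than
handling only a single term):

import org.apache.lucene.index.Term;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.payloads.BoostingTermQuery;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.search.QParser;
import org.apache.solr.search.QParserPlugin;

public class BoostingTermQParserPlugin extends QParserPlugin {
  public void init(NamedList args) {}

  public QParser createParser(String qstr, SolrParams localParams,
                              SolrParams params, SolrQueryRequest req) {
    return new QParser(qstr, localParams, params, req) {
      public Query parse() {
        // single-term sketch: turn the raw query string into a
        // BoostingTermQuery so payloads influence the score
        return new BoostingTermQuery(new Term("text", qstr));
      }
    };
  }
}

It would then be registered in solrconfig.xml with something like
<queryParser name="btq" class="com.example.BoostingTermQParserPlugin"/>
(a hypothetical package name) and selected per request with defType=btq.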






I'd love to see the BTQ in Solr
(and Spans!), but I wonder if it makes sense w/o better indexing side
support.  I assume you are rolling your own Analyzer, right?


Yup - I'm pretty sure I have that side figured out.  My input
contains terms marked up with a score (e.g. 'software?7'); I just
needed to create a TokenFilter that parses out the suffix and sets
the Payload on the token.


Cool.  Patch?





Spans and payloads are this huge untapped area for better search!


Completely agree - we do a lot with keyword searching, and we use  
this type of thing in our existing search implementation.  Thanks  
for the quick response!




--
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ









Re: Snappuller taking up CPU on master

2008-09-23 Thread Otis Gospodnetic
Hi,

Can't tell with certainty without looking, but my guess would be slow disk, 
high IO, and a large number of processes waiting for IO (run vmstat and look at 
the wa column).
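
For example, run something like "vmstat 5" while the pull is going on and
watch the wa column (CPU time spent waiting for I/O) together with the
bi/bo block-I/O columns; a consistently high wa value during the transfer
would confirm the box is I/O-bound rather than CPU-bound.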

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
 From: rahul_k123 [EMAIL PROTECTED]
 To: solr-user@lucene.apache.org
 Sent: Tuesday, September 23, 2008 6:56:48 PM
 Subject: Snappuller taking up CPU on master
 
 
 Hi,
 
 I am using snappuller to sync my slave with the master; I am not using the
 rsync daemon, I am doing the rsync over a remote shell.
 
 When I am serving requests from the master while snappuller is running
 (after optimization the total index is around 4 GB, so it transfers the
 whole index), the performance is very bad, actually causing timeouts.
 
 
 
 Any ideas why this happens?
 
 
 Any suggestions will help.
 
 
 Thanks.
 -- 
 View this message in context: 
 http://www.nabble.com/Snappuller-taking-up-CPU-on-master-tp19638474p19638474.html
 Sent from the Solr - User mailing list archive at Nabble.com.



solr score

2008-09-23 Thread sanraj25

hi,
  How do I give more weight to more frequently searched words in Solr?

What functionality does the Apache Solr module provide for this?
I have a list of the most frequently searched words on my site, and I need
to highlight those words. From the net I found out that 'score' is used for
this purpose. Is that true?
Does anybody know about it?
Please help me.

with Regards,
Santhanaraj R

-- 
View this message in context: 
http://www.nabble.com/solr-score-tp19642046p19642046.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Snappuller taking up CPU on master

2008-09-23 Thread rahul_k123

Hi,

Thanks for the reply.

I am not using Solr for indexing and serving search requests; I am using
only the scripts for replication.

Yes, it looks like I/O, but my question is how to handle this problem, and
whether there is an optimal way to achieve this.


Thanks.




Otis Gospodnetic wrote:
 
 Hi,
 
 Can't tell with certainty without looking, but my guess would be slow
 disk, high IO, and a large number of processes waiting for IO (run vmstat
 and look at the wa column).
 
 Otis
 --
 Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
 
 
 
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Snappuller-taking-up-CPU-on-master-tp19638474p19642053.html
Sent from the Solr - User mailing list archive at Nabble.com.