Re: character encoding issue

2009-11-04 Thread Jonathan Hendler

Hi Peter,

I have the same set of issues and will look for a response here.

Sometimes those stray chars are created at input time (for example,
when text is extracted from a Microsoft Office doc by a third-party
tool). But the MySQL data looking OK in the browser might be because
the encoding declared in MySQL is not the same as the encoding of the
original text. Say, for example, that the MySQL character set is Latin-1
and the document was UTF-8. When a browser renders the bytes, it might
assume they are UTF-8, but SOLR might be taking the table type literally
in the DIH (Latin1 Swedish, for example). It could also be that PHP
doesn't handle UTF-8 well, so it depends on your client.
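
The mismatch described above can be reproduced in a few lines. This is
only a sketch, assuming the stored bytes are UTF-8 and that something in
the chain reads them as Windows-1252 (the code page MySQL calls latin1).
The function names are made up for illustration and are not part of SOLR,
the DIH, or PHP.

```python
# Simulate UTF-8 bytes being misread as Windows-1252 ("latin1" in MySQL).
# garble() and repair() are hypothetical helper names for this sketch.

def garble(text: str) -> str:
    """Encode as UTF-8, then wrongly decode the bytes as cp1252."""
    return text.encode("utf-8").decode("cp1252")


def repair(text: str) -> str:
    """Undo garble(): re-encode as cp1252, then decode as UTF-8."""
    return text.encode("cp1252").decode("utf-8")


original = "caf\u00e9 \u2013 r\u00e9sum\u00e9"  # contains e-acute and an en dash
broken = garble(original)

print(broken)                      # every non-ASCII char turns into 2 or 3 chars
print(repair(broken) == original)  # the damage is reversible in this simple case
```

If the strange characters in the index follow this pattern, that would
point to an encoding mismatch between the table and the JDBC connection
rather than to Jetty.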


Don't think it has anything to do with Jetty - I use Resin.

Hope that helps,

- Jonathan


On Nov 4, 2009, at 8:48 AM, Peter Hedlund wrote:

I'm having a problem with character encoding.  The data that I'm  
indexing with SOLR is being pulled from a MySQL database and then  
the index is being integrated into a PHP application.  When I  
display the text from the SOLR index it's full of strange characters  
(–, é, etc...).  However, when I bypass SOLR and access the data  
from the MySQL table directly and write to the browser I don't see  
any problems with em-dashes and accented characters.


Is this a JETTY issue or a SOLR issue or something else?  (It's not
simply an issue of including <meta http-equiv="Content-Type"
content="text/html; charset=UTF-8"> either.)


Thanks for any help.

Peter Hedlund






Bug with DIH and MySQL CONCAT()?

2009-11-04 Thread Jonathan Hendler

Hi All,

I have an SQL query that begins with SELECT CONCAT (  'ID',  
Subject.id  , ':' , Subject.name , ':L', Subject.level) as  
subject_name and the query runs great against MySQL from the command  
line.
Since this is a nested entity, the schema.xml contains
<field name="subject_name" type="string" indexed="true" stored="true" multiValued="true" />


After a full-import, a select output of the xml looks like

<arr name="subject_name">
<str>[...@1db4c43</str>
<str>[...@6bcef1</str>
<str>[...@1df503b</str>
<str>[...@c5dbb</str>
<str>[...@1ddc3ea</str>
<str>[...@6963b0</str>
<str>[...@10fe215</str>
...


Without a CONCAT - it works fine.

Is this a bug?

Meanwhile - should I go about concatenating somewhere else in the DIH
config?
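
If the concatenation does move out of the SQL, the string that the
CONCAT above builds can be assembled from the plain columns instead. A
minimal sketch of that format, assuming the three columns are selected
separately. The function name is hypothetical, and where it would run
(a pre-processing script, the client) is left open.

```python
# Build the same value as
#   CONCAT('ID', Subject.id, ':', Subject.name, ':L', Subject.level)
# from separately selected columns. subject_name() is a made-up helper.

def subject_name(subject_id: int, name: str, level: int) -> str:
    """Return 'ID<id>:<name>:L<level>' for one Subject row."""
    return f"ID{subject_id}:{name}:L{level}"


print(subject_name(7, "Math", 2))
```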


Thanks.

- Jonathan




Re: solr query help alpha numeric and not

2009-11-04 Thread Jonathan Hendler

Hi Joel,

The ID is sent back as a string (instead of as an integer) in your  
example. Could this be the cause?


- Jonathan

On Nov 4, 2009, at 9:08 AM, Joel Nylund wrote:

Hi, I have a field called firstLetterTitle, this field has 1 char,  
it can be anything, I need help with a few queries on this char:


1.) I want all NON ALPHA and NON numbers, so any char that is not A-Z
or 0-9


I tried:

http://localhost:8983/solr/select?q=NOT%20firstLetterTitle:0%20TO%209%20AND%20NOT%20firstLetterTitle:A%20TO%20Z

But I get back numeric results:

<doc>
<str name="firstLetterTitle">9</str>
<str name="id">23946447</str>
</doc>


2.) I want only numerics:

http://localhost:8983/solr/select?q=firstLetterTitle:0%20TO%209

This seems to work but just checking if it's the right way.



3.) I want only English letters:

http://localhost:8983/solr/select?q=firstLetterTitle:A%20TO%20Z

This seems to work but just checking if it's the right way.


thanks
Joel
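
For reference, Lucene range syntax uses square brackets for inclusive
ranges, and a purely negative query is commonly written with a leading
*:* clause. The sketch below builds URL-encoded versions of the three
queries under those assumptions. It also assumes firstLetterTitle is an
untokenized string field holding upper-case letters, which the thread
does not confirm. The helper name is made up.

```python
from urllib.parse import urlencode

BASE = "http://localhost:8983/solr/select"


def select_url(query: str) -> str:
    """Return a select URL with the query safely percent-encoded."""
    return BASE + "?" + urlencode({"q": query})


# Inclusive range queries on the one-character field.
digits = "firstLetterTitle:[0 TO 9]"
letters = "firstLetterTitle:[A TO Z]"
# Everything that is neither: match all docs, then exclude both ranges.
neither = "*:* -firstLetterTitle:[0 TO 9] -firstLetterTitle:[A TO Z]"

for q in (neither, digits, letters):
    print(select_url(q))
```

Encoding the query through urlencode keeps the brackets and spaces from
being dropped or mangled on the way to the server.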





Proper way to set up Multi Core / Core admin

2009-11-02 Thread Jonathan Hendler
Getting started with a multi core setup following
http://wiki.apache.org/solr/CoreAdmin and the book. Generally everything
makes sense, but I have one question.


Here's how easy it was:

1. place the solr.war into the server
2. create your core directories in the newly created solr/ directory
3. set up solr.xml, the config files for a data import handler, the
   [core]/conf/solrconfig.xml, [core]/conf/schema.xml, etc
4. copy the /admin directory present in /solr into each /solr/[core]
   directory


Is step 4 a correct step in the setting up of a multi core environment?
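
For the solr.xml part of the setup, here is a rough sketch that
generates a file in the layout shown on the CoreAdmin wiki page. The
core names core0 and core1 are placeholders, and using the core name as
the instanceDir is an assumption of this example, not a requirement.

```python
import xml.etree.ElementTree as ET


def build_solr_xml(core_names):
    """Return a solr.xml string with one <core> entry per name."""
    solr = ET.Element("solr", {"persistent": "true"})
    cores = ET.SubElement(solr, "cores", {"adminPath": "/admin/cores"})
    for name in core_names:
        # Assumption: each core lives in a directory named after the core.
        ET.SubElement(cores, "core", {"name": name, "instanceDir": name})
    return ET.tostring(solr, encoding="unicode")


print(build_solr_xml(["core0", "core1"]))
```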

TIA

Re: Proper way to set up Multi Core / Core admin

2009-11-02 Thread Jonathan Hendler

Sorry for the confusion - step four is to be avoided, obviously.


On Nov 2, 2009, at 11:46 PM, Jonathan Hendler wrote:

Getting started with a multi core setup following
http://wiki.apache.org/solr/CoreAdmin and the book. Generally everything
makes sense, but I have one question.


Here's how easy it was:

1. place the solr.war into the server
2. create your core directories in the newly created solr/ directory
3. set up solr.xml, the config files for a data import handler, the
   [core]/conf/solrconfig.xml, [core]/conf/schema.xml, etc
4. copy the /admin directory present in /solr into each /solr/[core]
   directory


Is step 4 a correct step in the setting up of a multi core  
environment?


TIA




Simple problem with a nested entity and it's SQL

2009-10-28 Thread Jonathan Hendler
I have a nested entity on a jdbc data import handler that is causing  
an SQL error because the second key is either NULL (blank when  
generating the sql) or non-zero INT.

The query is in the following form:

<document name="content">
  <entity name="bl_lessonfiles" transformer="TemplateTransformer" query="SELECT * FROM table1">
    ...
    <entity name="user_index" query="SELECT * FROM table2 WHERE id = ${table1.somethin_like_a_foreign_key}">
    </entity>
  </entity>
</document>

Is the only way to avoid this to modify the source DB schema to be NOT  
NULL so it always returns at least a 0?


- Jonathan


Re: Simple problem with a nested entity and it's SQL

2009-10-28 Thread Jonathan Hendler

No - the SQL will fail to validate because at runtime it will look like


SELECT *  FROM table2 WHERE
IS NOT NULL table1.somethin_like_a_foreign_key
AND table1.somethin_like_a_foreign_key  0
AND id =




Note the id = 

On Oct 28, 2009, at 1:38 PM, Avlesh Singh wrote:


Shouldn't this work too?
SELECT *  FROM table2 WHERE IS NOT NULL
${table1.somethin_like_a_foreign_key} AND
${table1.somethin_like_a_foreign_key}  0 AND id =
${table1.somethin_like_a_foreign_key}

Cheers
Avlesh

On Wed, Oct 28, 2009 at 11:03 PM, Jonathan Hendler 
jonathan.hend...@gmail.com wrote:

I have a nested entity on a jdbc data import handler that is causing an SQL
error because the second key is either NULL (blank when generating the sql)
or non-zero INT.

The query is in the following form:

<document name="content">
  <entity name="bl_lessonfiles" transformer="TemplateTransformer" query="SELECT * FROM table1">
    ...
    <entity name="user_index" query="SELECT * FROM table2 WHERE id = ${table1.somethin_like_a_foreign_key}">
    </entity>
  </entity>
</document>

Is the only way to avoid this to modify the source DB schema to be NOT NULL
so it always returns at least a 0?

- Jonathan





Re: Simple problem with a nested entity and it's SQL

2009-10-28 Thread Jonathan Hendler

Thanks - that solution still causes an error.

But it helped me think of an SQL solution like so:
CONVERT ( '${table1.somethin_like_a_foreign_key}' , UNSIGNED INTEGER )

Convert the integer or NULL to a string, then back again. (ugly but it  
works)
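
To see why the quotes matter, here is a small simulation of the
placeholder substitution. It is not the DIH's actual code, only a sketch
that assumes a NULL value is rendered as an empty string, as described
in this thread. The render() helper and the sample value 42 are made up.

```python
import re

# Matches ${entity.column} style placeholders.
PLACEHOLDER = re.compile(r"\$\{([\w.]+)\}")


def render(query: str, row: dict) -> str:
    """Replace each placeholder with the row value; NULL becomes ''."""
    def substitute(match):
        value = row.get(match.group(1))
        return "" if value is None else str(value)
    return PLACEHOLDER.sub(substitute, query)


naive = "SELECT * FROM table2 WHERE id = ${table1.somethin_like_a_foreign_key}"
quoted = ("SELECT * FROM table2 WHERE id = "
          "CONVERT('${table1.somethin_like_a_foreign_key}', UNSIGNED INTEGER)")

null_row = {"table1.somethin_like_a_foreign_key": None}
int_row = {"table1.somethin_like_a_foreign_key": 42}

print(render(naive, null_row))   # ends with a dangling "id = "
print(render(quoted, null_row))  # the empty string is still a valid argument
print(render(quoted, int_row))
```

With the quotes in place the statement stays syntactically complete
whether the key is NULL or an integer.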





On Oct 28, 2009, at 1:48 PM, Avlesh Singh wrote:


Assuming this to be MySQL, will this work -
SELECT *  FROM table2 WHERE id =
IF(ISNULL(${table1.somethin_like_a_foreign_key}), 0,
${table1.somethin_like_a_foreign_key});

Cheers
Avlesh

On Wed, Oct 28, 2009 at 11:12 PM, Jonathan Hendler 
jonathan.hend...@gmail.com wrote:

No - the SQL will fail to validate because at runtime it will look  
like



SELECT *  FROM table2 WHERE

IS NOT NULL table1.somethin_like_a_foreign_key
AND table1.somethin_like_a_foreign_key  0
AND id =




Note the id = 










Re: Simple problem with a nested entity and it's SQL

2009-10-28 Thread Jonathan Hendler

Thanks - that's a good question.

I thought of using one single SQL statement - but the nested entity's  
query is actually quite complex (unlike the example).

So it'd be possible, but more readable as a separate query.
Further, MySQL also has some limitations around temporary tables and
seems to like separate queries (in terms of performance).





On Oct 28, 2009, at 2:49 PM, Fuad Efendi wrote:



Why can't we use single entity with single SELECT ... LEFT OUTER  
JOIN ...?
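
As an illustration of the single-query idea, the sketch below runs a
LEFT OUTER JOIN where one row has a NULL foreign key. It uses an
in-memory SQLite database rather than MySQL so that it is self-contained,
and the column names fk and name are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table2 (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE table1 (id INTEGER PRIMARY KEY, fk INTEGER);
INSERT INTO table2 VALUES (10, 'alice');
INSERT INTO table1 VALUES (1, 10);
INSERT INTO table1 VALUES (2, NULL);
""")

# Rows of table1 with a NULL key are kept; the joined columns are NULL.
rows = conn.execute("""
SELECT table1.id, table2.name
FROM table1
LEFT OUTER JOIN table2 ON table2.id = table1.fk
ORDER BY table1.id
""").fetchall()

print(rows)
```

The NULL key never has to be substituted into a second statement, so the
blank "id =" problem does not come up in this form.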




-Original Message-
From: Jonathan Hendler [mailto:jonathan.hend...@gmail.com]
Sent: October-28-09 1:33 PM
To: solr-user@lucene.apache.org
Subject: Simple problem with a nested entity and it's SQL

I have a nested entity on a jdbc data import handler that is causing
an SQL error because the second key is either NULL (blank when
generating the sql) or non-zero INT.

The query is in the following form:

<document name="content">
  <entity name="bl_lessonfiles" transformer="TemplateTransformer" query="SELECT * FROM table1">
    ...
    <entity name="user_index" query="SELECT * FROM table2 WHERE id = ${table1.somethin_like_a_foreign_key}">
    </entity>
  </entity>
</document>

Is the only way to avoid this to modify the source DB schema to be NOT
NULL so it always returns at least a 0?

- Jonathan







Using Solr for term-completion with counts

2009-10-26 Thread Jonathan Hendler

Greetings all,

We're happily migrating our MySQL fulltext search to SOLR/faceted  
search.


We're doing a term suggest on a large text field and doing  
facet.sort=count. The numbers returned represent total times the term  
shows up - I'd like to have the numbers represent the total documents  
(unique ids) containing the term instead. Can I do this through a  
facet query or another param ? I didn't see this example on the wiki  
or the new book.


In other words, to return Math (12 documents) instead of Math (93 times the
word shows up):
query=(+Ma*)&facet.prefix=Ma&facet.field=large_index_of_text&facet.sort=count&facet.mincount=1
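
To make the two numbers concrete, here is a toy calculation of total
occurrences versus documents containing a term. It says nothing about
how Solr computes facet counts internally. The three sample documents
and the whitespace tokenizing are assumptions of the sketch.

```python
from collections import Counter


def term_counts(docs):
    """Return (total occurrences, documents containing) per term."""
    occurrences = Counter()
    documents = Counter()
    for text in docs.values():
        words = text.lower().split()
        occurrences.update(words)     # every appearance counts
        documents.update(set(words))  # each document counts once per term
    return occurrences, documents


docs = {1: "math math math", 2: "math and music", 3: "music"}
occ, df = term_counts(docs)

print("math occurrences:", occ["math"])
print("math documents:  ", df["math"])
```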


- Jonathan

Re: Using Solr for term-completion with counts

2009-10-26 Thread Jonathan Hendler

Yonik, Thanks.

One question inline below:

On Oct 26, 2009, at 5:30 PM, Yonik Seeley wrote:



Can I do this through a facet query or another param ?


Faceting should also work... the terms component is very much like the
faceting component except that it always works over the complete index
(including deleted docs!) instead of a subset of it.


 Aren't deleted docs purged on delta updates? Maybe I misunderstand  
what you're describing above?





-Yonik
http://www.lucidimagination.com