Re: Adding field to solr dynamically

2013-10-15 Thread Mysurf Mail
Thanks.


On Sun, Oct 13, 2013 at 4:18 PM, Jack Krupansky j...@basetechnology.com wrote:

 Either simply use a dynamic field, or use the Schema API to add a static
 field:
 https://cwiki.apache.org/confluence/display/solr/Schema+API

 Dynamic fields (your nominal field name plus a suffix that specifies the
 type and multiplicity - as detailed in the Solr example schema) may be good
 enough, depending on the rest of your requirements.
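
 For illustration (a sketch; the suffix and type are whatever the example
 schema defines), a dynamic field rule such as

   <dynamicField name="*_s" type="string" indexed="true" stored="true"/>

 lets documents carry any new attribute named with that suffix, e.g.
 color_s or weight_s, with no change to schema.xml.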

 -- Jack Krupansky

 -Original Message- From: Mysurf Mail
 Sent: Sunday, October 13, 2013 5:32 AM
 To: solr-user@lucene.apache.org
 Subject: Adding field to solr dynamically


 My database model is designed using dynamic attributes (Entity Attribute
 Value model). For the db I have a service that adds a new attribute. But
 every time a new attribute is added I need to add it to the schema.xml.

 Is there a possible way to add a field to solr schema.xml dynamically?



Adding field to solr dynamically

2013-10-13 Thread Mysurf Mail
My database model is designed using dynamic attributes (Entity Attribute
Value model). For the db I have a service that adds a new attribute. But
every time a new attribute is added I need to add it to the schema.xml.

Is there a possible way to add a field to solr schema.xml dynamically?


Re: How to define facet.prefix as case-insensitive

2013-09-23 Thread Mysurf Mail
thanks.


On Sun, Sep 22, 2013 at 6:24 PM, Erick Erickson erickerick...@gmail.com wrote:

 You'll have to lowercase the term in your app and set
 terms.prefix to that value; there's no analysis done
 on the terms.prefix value.
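
 A sketch of that (reusing the /ac handler and Suggest field from the
 message below; host and port assumed): the application lowercases the
 user's prefix before building the request, so a typed Ch is sent as

   http://localhost:8983/solr/ac?q=*:*&facet.prefix=ch

 and the prefix then matches the lowercased terms in the Suggest field.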

 Best,
 Erick

 On Sun, Sep 22, 2013 at 4:07 AM, Mysurf Mail stammail...@gmail.com
 wrote:
  I am using facet.prefix for auto complete.
  This is my definition
 
   <requestHandler name="/ac" class="solr.SearchHandler">
     <lst name="defaults">
       <str name="echoParams">explicit</str>
       ...
       <str name="lowercaseOperators">true</str>
       <str name="facet">on</str>
       <str name="facet.field">Suggest</str>
     </lst>

   this is my field

   <field name="Suggest" type="text_auto" indexed="true" stored="true"
   required="false" multiValued="true"/>

   and

   <fieldType class="solr.TextField" name="text_auto">
     <analyzer>
       <tokenizer class="solr.KeywordTokenizerFactory"/>
       <filter class="solr.LowerCaseFilterFactory"/>
     </analyzer>
   </fieldType>
 
  all works fine but when I search using caps lock it doesn't return
 answers.
  Even when the field contains capitals letters - it doesn't.
 
  I assume that the field in solr is lowered (from the field type filter
  definition) but the search term is not.
  How can I control the search term caps/no caps?
 
  Thanks.



solr - searching part of words

2013-09-23 Thread Mysurf Mail
My field is defined as

<field name="PackageName" type="text_en" indexed="true" stored="true"
required="true"/>

*text_en is defined as in the original schema.xml that comes with solr

Now, my field has the following values

   - one
   - one1

Searching for one returns only the document with the value one. What causes
this? How can I change it?


How to define facet.prefix as case-insensitive

2013-09-22 Thread Mysurf Mail
I am using facet.prefix for auto complete.
This is my definition

<requestHandler name="/ac" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    ...
    <str name="lowercaseOperators">true</str>
    <str name="facet">on</str>
    <str name="facet.field">Suggest</str>
  </lst>

this is my field

<field name="Suggest" type="text_auto" indexed="true" stored="true"
required="false" multiValued="true"/>

and

<fieldType class="solr.TextField" name="text_auto">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

All works fine, but when I search using caps lock it doesn't return answers.
Even when the field contains capital letters - it doesn't.

I assume that the field in Solr is lowercased (from the field type filter
definition) but the search term is not.
How can I control the search term caps/no caps?

Thanks.


Re: solr suggestion -

2013-09-10 Thread Mysurf Mail
Yes.
I understood that from the result.
But how do I change that behaviour?

Don't do any analysis on the field you are using for suggestion

Please elaborate.


On Mon, Sep 9, 2013 at 8:48 PM, tamanjit.bin...@yahoo.co.in 
tamanjit.bin...@yahoo.co.in wrote:

 Don't do any analysis on the field you are using for suggestion. What is
 happening here is that at query time and indexing time the tokens are being
 broken on whitespace. So effectively, at is being taken as one token and
 l is being taken as another token, for which you get two different
 suggestions.
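
 As an illustration (a sketch along the lines of the phrase_suggest type used
 elsewhere in these threads), an analysis chain that keeps the whole entry as
 a single token looks like:

   <fieldType name="phrase_suggest" class="solr.TextField">
     <analyzer>
       <tokenizer class="solr.KeywordTokenizerFactory"/>
       <filter class="solr.LowerCaseFilterFactory"/>
     </analyzer>
   </fieldType>

 With this, at l stays one token and is matched as a whole prefix rather
 than as the two separate tokens at and l.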



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/solr-suggestion-tp4087841p4088919.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Solr suggest - How to define solr suggest as case insensitive

2013-09-10 Thread Mysurf Mail
I have added it and it didn't work. It still returns different results for q=C
and q=c.


On Tue, Sep 10, 2013 at 1:52 AM, Chris Hostetter
hossman_luc...@fucit.org wrote:


 : This is probably because your dictionary is made up of all lower case
 : tokens, but when you query the spell-checker similar analysis doesn't
 : happen. Ideal case would be when you query the spellchecker you send
 : lower case queries

 You can init the SpellCheckComponent with a queryAnalyzerFieldType
 option that will control what analysis happens.  ie...

   <!-- This field type's analyzer is used by the QueryConverter to
        tokenize the value for the q parameter -->
   <str name="queryAnalyzerFieldType">phrase_suggest</str>


 ...it would be nice if this defaulted to using the fieldType of the field
 you configure on the Suggester, but not all Impls are based on the index
 (you might be using an external dict file) so it has to be explicitly
 configured, and defaults to using a simple WhitespaceAnalyzer.
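
 In the suggester configuration from the original question, that option sits
 directly inside the searchComponent (a sketch; the type name is whichever
 analyzed field type the suggest field uses):

   <searchComponent class="solr.SpellCheckComponent" name="suggest">
     <str name="queryAnalyzerFieldType">phrase_suggest</str>
     <lst name="spellchecker">
       <str name="name">suggest</str>
       ...
     </lst>
   </searchComponent>

 so the incoming q value goes through the same lowercasing analysis as the
 indexed suggestions.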


 -Hoss



Solr Suggester - How do I filter autocomplete results

2013-09-10 Thread Mysurf Mail
I want to filter the autocomplete results from my suggester. Let's say I
have a book table:

Table (Id Guid, BookName String, BookOwner id)

I want each user to get an autocomplete list drawn from their own books.

I want to add something like the following:

http://.../solr/vault/suggest?q=c&fq=BookOwner:3

This doesn't work. What other ways do I have to implement it?


Solr doesn't return answer when searching numbers

2013-09-10 Thread Mysurf Mail
I am querying using

http://...:8983/solr/vault/select?q=design test&fl=PackageName

I get 3 results:

   - design test
   - design test 2013
   - design test for jobs

Now when I query using q=test for jobs
- I get only design test for jobs

But when I query using q=2013

http://...:8983/solr/vault/select?q=2013&fl=PackageName

I get no results. Why doesn't it return an answer when I query with numbers?

In schema.xml:

 <field name="PackageName" type="text_en" indexed="true" stored="true"
 required="true"/>


Solr suggest - How to define solr suggest as case insensitive

2013-09-08 Thread Mysurf Mail
My suggest (spellchecker) is returning case-sensitive answers. (I use it for
autocomplete - dog and Dog return different phrases.)

My suggester is defined as follows in solrconfig:

 <searchComponent class="solr.SpellCheckComponent" name="suggest">
   <lst name="spellchecker">
     <str name="name">suggest</str>
     <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
     <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
     <str name="field">suggest</str>  <!-- the indexed field to derive
                                           suggestions from -->
     <float name="threshold">0.005</float>
     <str name="buildOnCommit">true</str>
     <!--<str name="sourceLocation">american-english</str>-->
   </lst>
 </searchComponent>
 <requestHandler
     class="org.apache.solr.handler.component.SearchHandler"
     name="/suggest">
   <lst name="defaults">
     <str name="spellcheck">true</str>
     <str name="spellcheck.dictionary">suggest</str>
     <str name="spellcheck.onlyMorePopular">true</str>
     <str name="spellcheck.count">5</str>
     <str name="spellcheck.collate">true</str>
   </lst>
   <arr name="components">
     <str>suggest</str>
   </arr>
 </requestHandler>

in schema

<field name="suggest" type="phrase_suggest" indexed="true"
stored="true" required="false" multiValued="true"/>

and

<copyField source="Name" dest="suggest"/>

and

<fieldtype name="phrase_suggest" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="([^\p{L}\p{M}\p{N}\p{Cs}]*[\p{L}\p{M}\p{N}\p{Cs}\_]+:)|([^\p{L}\p{M}\p{N}\p{Cs}])+"
            replacement=" " replace="all"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldtype>


Problem parsing suggest response

2013-09-02 Thread Mysurf Mail
Hi,
I am having problems parsing the suggest JSON response in C#.
Here is an example:

{

   - responseHeader:
   {
  - status: 0,
  - QTime: 1
  },
   - spellcheck:
   {
  - suggestions:
  [
 - at,
 -
 {
- numFound: 1,
- startOffset: 1,
- endOffset: 3,
- suggestion:
[
   - atrion
   ]
},
 - l,
 -
 {
- numFound: 2,
- startOffset: 4,
- endOffset: 5,
- suggestion:
[
   - lot,
   - loadtest_template_700
   ]
},
 - collation,
 - atrion lot
 ]
  }

}

1. Is this valid JSON? Shouldn't every item be surrounded by quotation
marks?
2. The items at and l are not preceded by a name.
(This generates different XML in every online JSON-to-XML translator.)
Is this standard JSON?
Can I interfere with the structure?

Thanks.


solr suggestion -

2013-09-02 Thread Mysurf Mail
The following request

http://127.0.0.1:8983/solr/vault/suggest?wt=json&q=at%20l

returns phrases that start with at and with l (as shown below).
Now, what if I want phrases that start with At l, such as At Least...?
Thanks.


{
  "responseHeader": {
    "status": 0,
    "QTime": 1
  },
  "spellcheck": {
    "suggestions": [
      "at",
      {
        "numFound": 1,
        "startOffset": 1,
        "endOffset": 3,
        "suggestion": ["atrion"]
      },
      "l",
      {
        "numFound": 2,
        "startOffset": 4,
        "endOffset": 5,
        "suggestion": ["lot", "loadtest_template_700"]
      },
      "collation",
      "atrion lot"
    ]
  }
}


Troubles defining suggester/ understanding results

2013-08-15 Thread Mysurf Mail
I am having trouble defining a suggester for autocomplete after reading the
tutorial.

Here are my schema definitions:

<field name="PackageName" type="text_en" indexed="true" stored="true"
required="true"/>
<field name="PackageVersionComments" type="text_en" indexed="true"
stored="true" required="false"/>
 ...
<field name="SKUDescription" type="text_en" indexed="true"
stored="true" required="false" multiValued="true"/>

I also added two field types

<fieldtype name="text" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldtype>


<fieldtype name="phrase_suggest" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="([^\p{L}\p{M}\p{N}\p{Cs}]*[\p{L}\p{M}\p{N}\p{Cs}\_]+:)|([^\p{L}\p{M}\p{N}\p{Cs}])+"
            replacement=" " replace="all"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldtype>

Now, since I want to make suggestions from multiple fields and I can't
declare two fields, I defined:

and copied three of the fields using:

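For illustration (a sketch; field names taken from the schema above), copying
several fields into one multiValued suggest field usually amounts to:

<field name="suggest" type="phrase_suggest" indexed="true" stored="true"
multiValued="true"/>

<copyField source="PackageName" dest="suggest"/>
<copyField source="PackageVersionComments" dest="suggest"/>
<copyField source="SKUDescription" dest="suggest"/>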
Problems:
1. Everything loads pretty well, but copying the fields to new fields
just inflates my index. Is there a possibility to define the suggester on
more than one field? 2. I can't understand the results. Querying

 http://127.0.0.1:8983/solr/Book/suggest?q=th

returns docs such as
that are labelled in black on a black background and a little black light,
though querying

 http://127.0.0.1:8983/solr/vault-Book/suggest?q=lab

doesn't return anything.
lab is found in the previous result as well.
What is the problem?


Re: autocomplete feature - where to begin

2013-08-14 Thread Mysurf Mail
Thanks. Will read it now :-)


On Tue, Aug 13, 2013 at 8:33 PM, Cassandra Targett casstarg...@gmail.com wrote:

 The autocomplete feature in Solr is built on the spell checker
 component, and is called Suggester, which is why you've seen both of
 those mentioned. It's implemented with a searchComponent and a
 requestHandler.

 The Solr Reference Guide has a decent overview of how to implement it
 and I just made a few edits to make what needs to be done a bit more
 clear:

 https://cwiki.apache.org/confluence/display/solr/Suggester

 If you have suggestions for improvements to that doc (such as steps
 that aren't clear), you're welcome to set up an account there and
 leave a comment.

 Cassandra

 On Tue, Aug 13, 2013 at 11:16 AM, Mysurf Mail stammail...@gmail.com
 wrote:
  I have indexed the data from the db and so far it searches really well.
  Now I want to create auto-complete/suggest feature in my website
  So far I have seen articles about Suggester, spellchecker, and
  searchComponents.
  Can someone point me to a good article about basic autocomplete
  implementation?



solr not writing logs when it runs not from its main folder

2013-08-13 Thread Mysurf Mail
When I run solr using

java -jar C:\solr\example\start.jar

It writes logs to C:\solr\example\logs.

When I run it using

java -Dsolr.solr.home=C:\solr\example\solr
     -Djetty.home=C:\solr\example
     -Djetty.logs=C:\solr\example\logs
     -jar C:\solr\example\start.jar

it writes logs only if I run it from

C:\solr\example

From any other folder, logs are not written.
This is important, as I need to run it as a service later (using nssm). What
should I change?


autocomplete feature - where to begin

2013-08-13 Thread Mysurf Mail
I have indexed the data from the db and so far it searches really well.
Now I want to create an auto-complete/suggest feature on my website.
So far I have seen articles about Suggester, spellchecker, and
searchComponents.
Can someone point me to a good article about a basic autocomplete
implementation?


Solr - how do I index barcode

2013-08-08 Thread Mysurf Mail
I have a document that contains the following data:

car {
id: guid
name:   string
sku:   list<barcode>
}

Now, the barcodes don't have a pattern. They can be any of the
following:

ABCD-EF34GD-JOHN
ABCD-C08-YUVF

I want to index my documents so that a search for
1. ABCD will return both.
2. AB will return both.
3. JO - will return ABCD-EF34GD-JOHN but not car with name john.

So far I have defined car and sku as text_en,
but I don't get bullets no. 2 and 3.
Is there a better way to define the sku attribute?
Thanks.


Re: Solr - how do I index barcode

2013-08-08 Thread Mysurf Mail
Two notes:
1. My current query is similar to this:
http://127.0.0.1:8983/solr/vault/select?q=ABCD&qf=Name+SKU&defType=edismax

2. I want it to be case insensitive.




On Thu, Aug 8, 2013 at 2:52 PM, Mysurf Mail stammail...@gmail.com wrote:

 I have a documnet that contains the following data

 car {
 id: guid
 name:   string
 sku:   listbarcode
 }

 Now, The barcodes dont have a pattern. It can be either one of the
 follwings:

 ABCD-EF34GD-JOHN
 ABCD-C08-YUVF

 I want to index my documents so that search for
 1. ABCD will return both.
 2. AB will return both.
 3. JO - will return ABCD-EF34GD-JOHN but not car with name john.

 so far I have defined car and sku as text_en.
 But I dont get bulletes no 2 and 3.
 IS there a better way to define sku attribute.
 Thanks.



Re: solr - using fq parameter does not retrieve an answer

2013-08-06 Thread Mysurf Mail
Thanks.


On Mon, Aug 5, 2013 at 4:57 PM, Shawn Heisey s...@elyograg.org wrote:

 On 8/5/2013 2:35 AM, Mysurf Mail wrote:
  When I query using
 
  http://localhost:8983/solr/vault/select?q=*:*
 
  I get reuslts including the following
 
  <doc>
    ...
    ...
    <int name="VersionNumber">7</int>
    ...
  </doc>
 
  Now I try to get only that row so I add to my query fq=VersionNumber:7
 
  http://localhost:8983/solr/vault/select?q=*:*&fq=VersionNumber:7
 
  And I get nothing.
  Any idea?

 Is the VersionNumber field indexed?  If it's not, you won't be able to
 search on it.

 If you change your schema so that the field has indexed="true", you'll
 have to reindex.

 http://wiki.apache.org/solr/HowToReindex

 When you are retrieving a single document, it's better to use the q
 parameter rather than the fq parameter.  Querying a single document will
 pollute the cache.  It's a lot better to pollute the queryResultCache
 than the filterCache.  The former is generally much larger than the
 latter and better able to deal with pollution.
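
 A sketch of both points (the int type follows the schema snippets elsewhere
 in these threads):

   <field name="VersionNumber" type="int" indexed="true" stored="true"/>

 and, after reindexing, fetching that single document with the q parameter
 rather than fq:

   http://localhost:8983/solr/vault/select?q=VersionNumber:7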

 Thanks,
 Shawn




Knowing what field caused the retrieval of the document

2013-08-06 Thread Mysurf Mail
I have two indexed fields in my document - Name and Comment.
The user searches for a phrase and I need to act differently if it appeared
in the comment or the name.
Is there a way to know why the document was retrieved?
Thanks.


How to plan field boosting

2013-08-06 Thread Mysurf Mail
I query using

qf=Name+Tag

Now I want documents that have the phrase in Tag to come first, so I
use

qf=Name+Tag^2

and they do appear first.


What is the rule of thumb for the number that comes after the
field?
How do I know what number to set it to?


Re: Knowing what field caused the retrieval of the document

2013-08-06 Thread Mysurf Mail
But what if this is for multiple words?
I am guessing Solr knows why the document is there, since I get to see the
paragraph in the highlighting (hl) section.


On Tue, Aug 6, 2013 at 11:36 AM, Raymond Wiker rwi...@gmail.com wrote:

 If you were searching for single words (terms), you could use the 'tf'
 function, by adding something like

 matchesinname:tf(name, whatever)

 to the 'fl' parameter - if the 'name' field contains whatever, the
 (result) field 'matchesinname' will be 1.
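
 For the two fields from the question, that could look like (a sketch;
 whatever stands in for the searched term, and inname/incomment are just
 alias names):

   http://localhost:8983/solr/vault/select?q=whatever&fl=*,inname:tf(Name,whatever),incomment:tf(Comment,whatever)

 where the pseudo-fields inname and incomment in each result show which
 field matched.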




 On Tue, Aug 6, 2013 at 10:24 AM, Mysurf Mail stammail...@gmail.com
 wrote:

  I have two indexed fields in my document.- Name, Comment.
  The user searches for a phrase and I need to act differently if it
 appeared
  in the comment or the name.
  Is there a way to know why the document was retrieved?
  Thanks.
 



Multiple sorting does not work as expected

2013-08-06 Thread Mysurf Mail
My documents have 2 indexed attributes - name (string) and version (number).
I want documents with the same score to be displayed in the
following order:

score(desc),name(desc),version(desc)

Therefore I query using:

http://localhost:8983/solr/vault/select?
   q=BOM&fl=*,score
   &sort=score+desc,Name+desc,Version+desc

And I get the following inside the result:

<doc>
   <str name="Name">BOM Total test2</str>
   ...
   <int name="Version">2</int>
   ...
   <float name="score">2.2388418</float>
</doc>
<doc>
   <str name="Name">BOM Total test - Copy</str>
   ...
   <int name="Version">2</int>
   ...
   <float name="score">2.2388418</float>
</doc>
<doc>
  <str name="Name">BOM Total test2</str>
  ...
  <int name="Version">1</int>
  ...
  <float name="score">2.2388418</float>
</doc>

The scoring is equal, but the name is not sorted.

What am I doing wrong here?


Re: Multiple sorting does not work as expected

2013-08-06 Thread Mysurf Mail
my schema


 <field name="Name" type="text_en" indexed="true" stored="true"
 required="true"/>
 <field name="Version" type="int" indexed="true" stored="true" required="true"/>



On Tue, Aug 6, 2013 at 5:06 PM, Mysurf Mail stammail...@gmail.com wrote:

 My documents has 2 indexed attribute - name (string) and version (number)
 I want within the same score the documents will be displayed by the
 following order

 score(desc),name(desc),version(desc)

 Therefor I query using :

 http://localhost:8983/solr/vault/select?
q=BOMfl=*:score
sort=score+desc,Name+desc,Version+desc

 And I get the following inside the result:

 doc
str name=NameBOM Total test2/str
...
int name=Version2/int
...
float name=score2.2388418/float
 /doc
 doc
str name=NameBOM Total test - Copy/str
...
int name=Version2/int
...
float name=score2.2388418/float
 /doc
 doc
   str name=NameBOM Total test2/str
   ...
   int name=Version1/int
   ...
   float name=score2.2388418/float
 /doc

 The scoring is equal, but the name is not sorted.

 What am I doing wrong here?



Re: Multiple sorting does not work as expected

2013-08-06 Thread Mysurf Mail
I don't see how it is sorted.
This is the order as displayed above:

1- BOM Total test2
2- BOM Total test - Copy
3- BOM Total test2

all with the same score of 2.2388418.


On Tue, Aug 6, 2013 at 5:28 PM, Jack Krupansky j...@basetechnology.com wrote:

 The Name field is sorted as you have requested - desc. I suspect that
 you wanted name to be sorted asc (natural order.)
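
 A sketch of that change - only the Name direction flips:

   sort=score+desc,Name+asc,Version+desc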

 -- Jack Krupansky

 -Original Message- From: Mysurf Mail
 Sent: Tuesday, August 06, 2013 10:22 AM
 To: solr-user@lucene.apache.org
 Subject: Re: Multiple sorting does not work as expected


 my schema

 
 field name=Name type=text_en indexed=true stored=true
 required=true/
 field name=Version type=int indexed=true stored=true
 required=true/
 



 On Tue, Aug 6, 2013 at 5:06 PM, Mysurf Mail stammail...@gmail.com wrote:

  My documents has 2 indexed attribute - name (string) and version (number)
 I want within the same score the documents will be displayed by the
 following order

 score(desc),name(desc),version(desc)

 Therefor I query using :

 http://localhost:8983/solr/vault/select?
    q=BOM&fl=*,score
    &sort=score+desc,Name+desc,Version+desc

 And I get the following inside the result:

 doc
str name=NameBOM Total test2/str
...
int name=Version2/int
...
float name=score2.2388418/float
 /doc
 doc
str name=NameBOM Total test - Copy/str
...
int name=Version2/int
...
float name=score2.2388418/float
 /doc
 doc
   str name=NameBOM Total test2/str
   ...
   int name=Version1/int
   ...
   float name=score2.2388418/float
 /doc

 The scoring is equal, but the name is not sorted.

 What am I doing wrong here?





solr - using fq parameter does not retrieve an answer

2013-08-05 Thread Mysurf Mail
When I query using

http://localhost:8983/solr/vault/select?q=*:*

I get results including the following:

<doc>
  ...
  ...
  <int name="VersionNumber">7</int>
  ...
</doc>

Now I try to get only that row so I add to my query fq=VersionNumber:7

http://localhost:8983/solr/vault/select?q=*:*&fq=VersionNumber:7

And I get nothing.
Any idea?


Re: solr - please help me arrange my search url

2013-08-04 Thread Mysurf Mail
So,
if I always query over more than one field, and they are always the same
fields, then I cannot place them in a config file?
Should I always list all of them in my URL?



On Thu, Aug 1, 2013 at 5:05 PM, Jack Krupansky j...@basetechnology.com wrote:

 1. df only supports a single field. All but the first will be ignored.
 2. qf takes a list as a space-delimited string, with optional boost (^n)
 after each field name.
 3. df is only used by edismax if qf is not present.
 4. Your working query uses a different term (walk) than your other
 queries (jump).

 Are you sure that jump appears in that field? What does your field
 analyzer look like? Or is it a string field? If the latter, does the case
 match exactly and are there any extraneous spaces?
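
 One way to avoid repeating the always-searched fields in every URL (a
 sketch; handler defaults and field names follow the quoted message below)
 is a single space-delimited qf entry in the handler's defaults:

   <lst name="defaults">
     <str name="defType">edismax</str>
     <str name="qf">PackageName PackageTag</str>
   </lst>

 after which a bare q=jump searches both fields.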

 -- Jack Krupansky

 -Original Message- From: Mysurf Mail
 Sent: Thursday, August 01, 2013 7:48 AM
 To: solr-user@lucene.apache.org
 Subject: solr - please help me arrange my search url


 I am still doing something wrong with solr.

 I am querying with the following parameters

 http://...:8983/solr/vault/select?q=jump&qf=PackageTag&defType=edismax

 (meaning I am using edismax and I query on the field PackageTag )

 I get nothing.

 when I dont declare the field and query

 http://...:8983/solr/vault/select?q=jump&defType=edismax

 and declare the searched on fileds in

 <lst name="defaults">
   <str name="echoParams">explicit</str>
   <int name="rows">10</int>
   <str name="df">PackageName</str>
   <str name="df">PackageTag</str>
   ...

 I get also nothing

 Its only when I query with

 http://...:8983/solr/vault/select?q=PackageTag:walk&defType=edismax

 My goal is to have two kinds of url -

   1. one that will query without getting the SearchedOn fields.

   I will put default declaration in another place (where then?)
   2. one that will query with getting the SearchedOn fields.

   should I use dismax?edismax? qf or q=..:...

 Thanks.



solr - please help me arrange my search url

2013-08-01 Thread Mysurf Mail
I am still doing something wrong with solr.

I am querying with the following parameters

http://...:8983/solr/vault/select?q=jump&qf=PackageTag&defType=edismax

(meaning I am using edismax and I query on the field PackageTag )

I get nothing.

when I don't declare the field and query

http://...:8983/solr/vault/select?q=jump&defType=edismax

and declare the searched-on fields in

<lst name="defaults">
   <str name="echoParams">explicit</str>
   <int name="rows">10</int>
   <str name="df">PackageName</str>
   <str name="df">PackageTag</str>
   ...

I also get nothing.

It's only when I query with

http://...:8983/solr/vault/select?q=PackageTag:walk&defType=edismax

My goal is to have two kinds of URL:

   1. one that will query without getting the searched-on fields;
   I will put the default declaration in another place (where then?)
   2. one that will query with getting the searched-on fields.
   Should I use dismax? edismax? qf or q=..:...?

Thanks.


Working with solr over two different db schemas

2013-07-31 Thread Mysurf Mail
Been working on it for quite some time.

this is my config



<dataConfig>
  <dataSource type="JdbcDataSource" name="ds1"
      driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
      url="jdbc:sqlserver://...:1433;databaseName=A"
      user="XX" password="XX" />
  <document>

    <entity name="PackageVersion" pk="PackageVersionId"
        query="/*PackageVersion.Query*/ select PackageVersion.Id PackageVersionId,
          PackageVersion.VersionNumber, CONVERT(char(19),
          PackageVersion.LastModificationTime ,126) + 'Z' LastModificationTime,
          Package.Id PackageId, Package.Name
          PackageName, PackageVersion.Comments PackageVersionComments,
          Package.CreatedBy CreatedBy
          from [dbo].[Package] Package inner join [dbo].[PackageVersion]
          PackageVersion on Package.Id = PackageVersion.PackageId
          where Package.RecordStatusId=0 and PackageVersion.RecordStatusId=0">
      <entity name="PackageTag" pk="ResourceId"
          processor="CachedSqlEntityProcessor" cacheKey="ResourceId"
          cacheLookup="PackageVersion.PackageId"
          query="/*PackageTag.Query*/
            select ResourceId,[Text] PackageTag
            from [dbo].[Tag] Tag
            Where ResourceType = 0"/>
    </entity>
  </document>
</dataConfig>

Now, this runs in my test env, and the only thing I do is change the
configuration to another db (and as a result also the schema name, from
[dbo] to another).
This results in totally different behavior.
In the first configuration the selects were done in this order - inner
object and then outer object - which means that the cache works.
In the second configuration - over the other db - the order was first the
outer and then the inner. The cache did not work at all;
the inner query is not stored at all.

What could be the problem?


solr - set fields as default search field

2013-07-29 Thread Mysurf Mail
The following query works well for me

http://[]:8983/solr/vault/select?q=VersionComments%3AWhite

returns all the documents where VersionComments includes White.

I try to omit the field name and put it as a default value as follows. In
solrconfig I write:

<requestHandler name="/select" class="solr.SearchHandler">
  <!-- default values for query parameters can be specified, these
       will be overridden by parameters in the request
    -->
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <str name="df">PackageName</str>
    <str name="df">Tag</str>
    <str name="df">VersionComments</str>
    <str name="df">VersionTag</str>
    <str name="df">Description</str>
    <str name="df">SKU</str>
    <str name="df">SKUDesc</str>
  </lst>

I restart Solr and run a full import.
Then I try using

 http://[]:8983/solr/vault/select?q=White

(Where

 http://[]:8983/solr/vault/select?q=VersionComments%3AWhite

still works)

But I don't get any document as an answer.
What am I doing wrong?


Re: adding date column to the index

2013-07-23 Thread Mysurf Mail
Ahaa
I deleted the data folder and now I get
Invalid Date String:'2010-01-01 00:00:00 +02:00'
I need to cast it to a Solr date, as I read it in the schema using

<field name="LastModificationTime" type="date" indexed="false"
stored="true" required="true"/>


On Tue, Jul 23, 2013 at 10:50 AM, Gora Mohanty g...@mimirtech.com wrote:

 On 23 July 2013 11:13, Mysurf Mail stammail...@gmail.com wrote:
  clarify: I did deleted the data in the index and reloaded it (+ commit).
  (As i said, I have seen it loaded in the sb profiler)
 [...]

 Please share your DIH configuration file, and Solr's
 schema.xml. It must be that somehow the column
 is not getting indexed.

 Regards,
 Gora



Re: adding date column to the index

2013-07-23 Thread Mysurf Mail
How do I cast datetimeoffset(7) to a Solr date?


On Tue, Jul 23, 2013 at 11:11 AM, Mysurf Mail stammail...@gmail.com wrote:

 Ahaa
 I deleted the data folder and now I get
 Invalid Date String:'2010-01-01 00:00:00 +02:00'
 I need to cast it to solr. as I read it in the schema using

 field name=LastModificationTime type=date indexed=false
 stored=true required=true/


 On Tue, Jul 23, 2013 at 10:50 AM, Gora Mohanty g...@mimirtech.com wrote:

 On 23 July 2013 11:13, Mysurf Mail stammail...@gmail.com wrote:
  clarify: I did deleted the data in the index and reloaded it (+ commit).
  (As i said, I have seen it loaded in the sb profiler)
 [...]

 Please share your DIH configuration file, and Solr's
 schema.xml. It must be that somehow the column
 is not getting indexed.

 Regards,
 Gora





solr - Deleting a row from the index, using the configuration files only.

2013-07-23 Thread Mysurf Mail
I am updating my solr index using deltaQuery and deltaImportQuery
attributes in data-config.xml.
In my condition I write

where MyDoc.LastModificationTime > '${dataimporter.last_index_time}'
then after I add a row I trigger an update using data-config.xml.

Now, sometimes I delete a row.
How can I implement this with configuration files only
(without sending a delete REST command to Solr)?

Let's say my object is not deleted but its status is changed to deleted.
I don't index that status field, as I want to hold only the live rows.
(Otherwise I could have just filtered on it.)
Is there a way to do it?
Thanks.


filter query result by user

2013-07-23 Thread Mysurf Mail
I want to restrict the returned results to be only the documents that were
created by the user.
I then load the CreatedBy attribute into the index and set it to
indexed=false, stored=true:

<field name="CreatedBy" type="string" indexed="false" stored="true"
required="true"/>

Then I want to filter by CreatedBy, so in the dashboard
I check edismax and add CreatedBy:user1 to the qf field.


The resulting query is

http://
:8983/solr/vault/select?q=*%3A*&defType=edismax&qf=CreatedBy%3Auser1

Nothing is filtered; all rows are returned.
What am I doing wrong?


Re: filter query result by user

2013-07-23 Thread Mysurf Mail
But I don't want it to be searched on.

Let's say the user name is giraffe.
I do want the filter to be where CreatedBy = giraffe,

but when the user searches for his name, I will want only documents with the
name Giraffe.
Since it is indexed, wouldn't it return all rows created by him?
Thanks.



On Tue, Jul 23, 2013 at 4:28 PM, Raymond Wiker rwi...@gmail.com wrote:

 Simple: the field needs to be indexed in order to search (or filter) on
 it.
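
 A sketch of that (reusing the field from the question; changing indexed
 requires a full reindex):

   <field name="CreatedBy" type="string" indexed="true" stored="true"
   required="true"/>

 after which the per-user restriction can go into a filter query rather than
 qf:

   http://...:8983/solr/vault/select?q=*:*&fq=CreatedBy:giraffe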


 On Tue, Jul 23, 2013 at 3:26 PM, Mysurf Mail stammail...@gmail.com
 wrote:

  I want to restrict the returned results to be only the documents that
 were
  created by the user.
  I then load to the index the createdBy attribute and set it to index
  false,stored=true
 
  field name=CreatedBy type=string indexed=false stored=true
  required=true/
 
  then in the I want to filter by CreatedBy so I use the dashboard, check
  edismax and add
  I check edismax and add CreatedBy:user1 to the qf field.
 
 
  the result query is
 
  http://
  :8983/solr/vault/select?q=*%3A*defType=edismaxqf=CreatedBy%3Auser1
 
  Nothing is filtered. all rows returned.
  What was I doing wrong?
 



Re: filter query result by user

2013-07-23 Thread Mysurf Mail
I am probably using it wrong.
http://
...:8983/solr/vault10k/select?q=*%3A*&defType=edismax&qf=CreatedBy%BLABLA
returns all rows.
It ignores my qf filter.

Should I even use qf for filtering with edismax?
(It doesn't say that in the doc
http://wiki.apache.org/solr/ExtendedDisMax#qf_.28Query_Fields.29)



On Tue, Jul 23, 2013 at 4:32 PM, Mysurf Mail stammail...@gmail.com wrote:

 But I dont want it to be searched.on

 lets say the user name is giraffe
 I do want to filter to be where created by = giraffe

 but when the user searches his name, I will want only documents with name
 Giraffe.
 since it is indexed, wouldn't it return all rows created by him?
 Thanks.



 On Tue, Jul 23, 2013 at 4:28 PM, Raymond Wiker rwi...@gmail.com wrote:

 Simple: the field needs to be indexed in order to search (or filter) on
 it.


 On Tue, Jul 23, 2013 at 3:26 PM, Mysurf Mail stammail...@gmail.com
 wrote:

  I want to restrict the returned results to be only the documents that
 were
  created by the user.
  I then load to the index the createdBy attribute and set it to index
  false,stored=true
 
  field name=CreatedBy type=string indexed=false stored=true
  required=true/
 
  then in the I want to filter by CreatedBy so I use the dashboard,
 check
  edismax and add
  I check edismax and add CreatedBy:user1 to the qf field.
 
 
  the result query is
 
  http://
  :8983/solr/vault/select?q=*%3A*defType=edismaxqf=CreatedBy%3Auser1
 
  Nothing is filtered. all rows returned.
  What was I doing wrong?
 





adding date column to the index

2013-07-22 Thread Mysurf Mail
I have added a date field to my index.
I don't want the query to search on this field, but I want it to be returned
with each row.
So I have defined it in the schema.xml as follows:

  <field name="LastModificationTime" type="date" indexed="false"
  stored="true" required="true"/>



I added it to the select in data-config.xml and I see it selected in the
profiler.
Now, when I query all fields (using the dashboard) I don't see it.
Even when I ask for it specifically I don't see it.
What am I doing wrong?

(In the db it is (datetimeoffset(7)))


deserializing highlighting json result

2013-07-22 Thread Mysurf Mail
When I request a JSON result I get the following structure in the
highlighting:

{"highlighting": {
   "394c65f1-dfb1-4b76-9b6c-2f14c9682cc9": {
      "PackageName": ["- <em>Testing</em> channel twenty."]},
   "baf8434a-99a4-4046-8a4d-2f7ec09eafc8": {
      "PackageName": ["- <em>Testing</em> channel twenty."]},
   "0a699062-cd09-4b2e-a817-330193a352c1": {
      "PackageName": ["- <em>Testing</em> channel twenty."]},
   "0b9ec891-5ef8-4085-9de2-38bfa9ea327e": {
      "PackageName": ["- <em>Testing</em> channel twenty."]}}}


It is difficult to deserialize this JSON because the guid is in the
attribute name.
Is that solvable (using C#)?


Re: adding date column to the index

2013-07-22 Thread Mysurf Mail
To clarify: I did delete the data in the index and reload it (+ commit).
(As I said, I have seen it loaded in the db profiler.)
Thanks for your comment.


On Mon, Jul 22, 2013 at 9:25 PM, Lance Norskog goks...@gmail.com wrote:

 Solr/Lucene does not automatically add fields when asked, the way DBMS systems
 do. Instead, all data for a field is added at the same time. To get the new
 field, you have to reload all of your data.

 This is also true for deleting fields. If you remove a field, that data
 does not go away until you re-index.


 On 07/22/2013 07:31 AM, Mysurf Mail wrote:

 I have added a date field to my index.
 I dont want the query to search on this field, but I want it to be
 returned
 with each row.
 So I have defined it in the scema.xml as follows:
field name=LastModificationTime type=date indexed=false
 stored=true required=true/



 I added it to the select in data-config.xml and I see it selected in the
 profiler.
 now, when I query all fileds (using the dashboard) I dont see it.
 Even when I ask for it specifically I dont see it.
 What am I doing wrong?

 (In the db it is (datetimeoffset(7)))





Re: deserializing highlighting json result

2013-07-22 Thread Mysurf Mail
The guid appears as the attribute name itself, and not as

id: baf8434a-99a4-4046-8a4d-2f7ec09eafc8

Trying to create an object that holds this guid will create an attribute
with the name baf8434a-99a4-4046-8a4d-2f7ec09eafc8.

On Mon, Jul 22, 2013 at 6:30 PM, Jack Krupansky j...@basetechnology.com wrote:

 Exactly why is it difficult to deserialize? Seems simple enough.

 -- Jack Krupansky

 -Original Message- From: Mysurf Mail
 Sent: Monday, July 22, 2013 11:14 AM
 To: solr-user@lucene.apache.org
 Subject: deserializing highlighting json result
 When I request a json result I get the following streucture in the
 highlighting

 {highlighting:{
   394c65f1-dfb1-4b76-9b6c-2f14c9682cc9:{
  PackageName:[- emTestingem channel twenty.]},
   baf8434a-99a4-4046-8a4d-2f7ec09eafc8:{
  PackageName:[- emTestingem channel twenty.]},
   0a699062-cd09-4b2e-a817-330193a352c1:{
 PackageName:[- emTestingem channel twenty.]},
   0b9ec891-5ef8-4085-9de2-38bfa9ea327e:{
 PackageName:[- emTestingem channel twenty.]}}}


 It is difficult to deserialize this json because the guid is in the
 attribute name.
 Is that solveable (using c#)?



Running Solr in a cluster - high availability only

2013-07-15 Thread Mysurf Mail
Hi,
I would like to run two Solr instances on different computers as a cluster.
My main interest is high availability - meaning, in case one server crashes
or is down, there will always be another one.

(My performance on a single instance is great. I do not need to split the
data across two servers.)

Questions:
1. What is the best practice?
Is it different than clustering for index splitting? Do I need Shards?
2. Do I need ZooKeeper?
3. Is it a container-based configuration (different for Jetty and Tomcat)?
4. Do I need an external NLB for that?
5. When one computer comes back up after crashing, how does it update its index?


Re: two types of answers in my query

2013-07-10 Thread Mysurf Mail
This will work.
Thanks.


On Tue, Jul 9, 2013 at 4:37 PM, Jack Krupansky j...@basetechnology.com wrote:

 Usually a car term and a car part term will look radically different. So,
 simply use the edismax query parser and set qf to be both the car and car
 part fields. If either matches, the document will be selected. And if you
 have a type field, you can check that to see if a car or part was matched
 in the results.
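
 A sketch of such a request (the field names here are placeholders for the
 car and car-part fields):

   http://...:8983/solr/vault/select?q=<user input>&defType=edismax&qf=CarName+PartSKU&fl=*,Type

 where a Type field in the results, if you index one, tells whether a car or
 a part matched.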

 -- Jack Krupansky

 -Original Message- From: Mysurf Mail
 Sent: Tuesday, July 09, 2013 2:38 AM
 To: solr-user@lucene.apache.org
 Subject: two types of answers in my query


 Hi,
 A general question:


 Let's say I have Car And CarParts 1:n relation.

 And I have discovered that the user had entered in the search field instead
 of car name - a part serial number (SKU).
 (I discovered it useing regex)

 Is there a way to fetch different types of answers in Solr?
 Is there a way to fetch mixed types in the answers?
 Is there something similiar to that and how is that feature called?

 Thank you.



Disabling word breaking for codes and SKUs

2013-07-10 Thread Mysurf Mail
Some of the data in my index is SKUs and barcodes, as follows:
ASDF3-DASDD-2133DD-21H44

I want to disable the word breaking for this type (maybe through regex).
Is there a possible way to do this?


two types of answers in my query

2013-07-09 Thread Mysurf Mail
Hi,
A general question:


Let's say I have Car And CarParts 1:n relation.

And I have discovered that the user had entered a part serial number (SKU) in
the search field instead of a car name.
(I discovered it using regex.)

Is there a way to fetch different types of answers in Solr?
Is there a way to fetch mixed types in the answers?
Is there something similar to that, and what is that feature called?

Thank you.


Solr - Delta Query Via Full Import

2013-07-02 Thread Mysurf Mail
I am using DIH to fetch rows from the db to Solr.
I have many 1:n relations, and I can do it only if I use caching (super
fast). Therefore I am adding the following attributes to my inner entity:

processor="CachedSqlEntityProcessor" cacheKey= cacheLookup=

Everything works great and fast. (First the n tables are queried, then the
main entity.)

Now I want to configure the delta import, and it is not actually working.

I know that by the standard
(http://wiki.apache.org/solr/DataImportHandler#Delta-Import_Example)
I need to define the following attributes:

   1. query - Initial Query
   2. DeltaQuery - The rows that were changed
   3. DeltaImportQuery - Fetch the data that was changed
   4. parentDeltaQuery - The Keys of the parent entity that has changed
   rows in the current entity

(2-4 only used in delta queries)

And I have seen a hack in the documents
(http://wiki.apache.org/solr/DataImportHandler#Delta-Import_Example)
that you can do a delta query via full import.
So instead of adding the following attributes -
query, deltaImportQuery, deltaQuery - I can just add query and call full
instead of delta.

Problem - Only the first query (main entity) is executed when I run the
full import without clean.

Here is a part of my configuration in data-config.xml (I have left
deltaImportQuery though I call only full import)

entity name=PackageVersion pk=PackageVersionId
query=  select 
from [dbo].[Package] Package inner join
[dbo].[PackageVersion] PackageVersion on Package.Id =
PackageVersion.PackageId
Where '${dataimporter.request.clean}' != 'false'
OR Package.LastModificationTime >
'${dataimporter.last_index_time}' OR PackageVersion.Timestamp >
'${dataimporter.last_index_time}'
deltaImportQuery=select ...
from [dbo].[Package] Package inner join
[dbo].[PackageVersion] PackageVersion on Package.Id =
PackageVersion.PackageId
Where '${dataimporter.request.clean}' != 'false'
OR Package.LastModificationTime >
'${dataimporter.last_index_time}' OR PackageVersion.Timestamp >
'${dataimporter.last_index_time}' and
ID=='${dih.delta.id}'
entity name=PackageTag pk=ResourceId
processor=CachedSqlEntityProcessor cacheKey=ResourceId
cacheLookup=PackageVersion.PackageId
query=  SELECT ResourceId,[Text] PackageTag
from [dbo].[Tag] Tag
Where '${dataimporter.request.clean}' = 'true'
OR Tag.TimeStamp > '${dataimporter.last_index_time}'
parentDeltaQuery=select PackageVersion.PackageVersionId
  from [dbo].[Package] Package
  inner join [dbo].[PackageVersion] PackageVersion
  ON Package.Id = PackageVersion.PackageId
  where Package.Id=${PackageTag.ResourceId}
/entity
/entity


parent Import Query doesn't run

2013-07-02 Thread Mysurf Mail
I have a 1:n relation between my main entity (PackageVersion) and its tags in
my DB.

I add a new tag to the db at that timestamp and I run the delta
import command.
The delta select retrieves the line, but I don't see any other SQL.
Here are my data-config.xml configurations:

entity name=PackageVersion pk=PackageVersionId
query=  select ...
from [dbo].[Package] Package inner join
[dbo].[PackageVersion] PackageVersion on Package.Id =
PackageVersion.PackageId
deltaQuery = select PackageVersion.Id PackageVersionId
  from [dbo].[Package] Package inner join
[dbo].[PackageVersion] PackageVersion on Package.Id =
PackageVersion.PackageId
  where Package.LastModificationTime >
'${dataimporter.last_index_time}' OR PackageVersion.Timestamp >
'${dataimporter.last_index_time}'
deltaImportQuery=select ...
  from [dbo].[Package] Package inner join
[dbo].[PackageVersion] PackageVersion on Package.Id =
PackageVersion.PackageId
  Where PackageVersionId=='${dih.delta.id}' 

entity name=PackageTag pk=ResourceId
processor=CachedSqlEntityProcessor cacheKey=ResourceId
cacheLookup=PackageVersion.PackageId
query=  SELECT ResourceId,[Text] PackageTag
 from [dbo].[Tag] Tag
deltaQuery=SELECT ResourceId,[Text] PackageTag
from [dbo].[Tag] Tag
Where Tag.TimeStamp >
'${dataimporter.last_index_time}'
parentDeltaQuery=select PackageVersion.PackageVersionId
from [dbo].[Package]
where
Package.Id=${PackageVersion.PackageVersionId}
/entity
/entity


Solr - working with delta import and cache

2013-07-02 Thread Mysurf Mail
I have two entities in 1:n relation - PackageVersion and Tag.
I have configured DIH to use CachedSqlEntityProcessor and everything works
as planned.
First, the Tag entity is selected using the query attribute, then the main
entity.
Ultra fast.

Now I am adding the delta import.
Everything runs and loads, but too slowly.
Looking at the db profiler output I see:

   1. the delta query of the inner entities run first - which is good.
   2. the delta query of the main entities runs later - which is still good.
   3. deltaImportQuery of the main entity runs as a separate select for each
   of the IDs; this could be improved using a where ... in over all the
   results. Is that possible?
   4. All of the query attributes of the other tables are running now. This is
   bad. (In real life I have more than one table in a 1:n connection.) For
   instance I get a lot of

   select ResourceId,[Text] PackageTag
   from [dbo].[Tag] Tag
   Where  ResourceType = 0


run. Because it is from the query attribute, there is no where clause for
using the ids.
a. How can I fix it?
b. Can I translate the import query to use where in?
c. There is no real order for all the selects when requesting deltaImport.
Is it possible to implement the caching also when updating delta?

Here is my configuration

entity name=PackageVersion pk=PackageVersionId
query=  select 
from [dbo].[Package] Package inner join
[dbo].[PackageVersion] PackageVersion on Package.Id =
PackageVersion.PackageId
deltaQuery = select PackageVersion.Id PackageVersionId
  from [dbo].[Package] Package inner join
[dbo].[PackageVersion] PackageVersion on Package.Id =
PackageVersion.PackageId
  where Package.LastModificationTime >
'${dataimporter.last_index_time}' OR PackageVersion.Timestamp >
'${dih.last_index_time}'
deltaImportQuery= select 
from [dbo].[Package] Package inner join
[dbo].[PackageVersion] PackageVersion on Package.Id =
PackageVersion.PackageId
Where PackageVersion.Id='${dih.delta.PackageVersionId}' 

entity name=PackageTag pk=ResourceId
processor=CachedSqlEntityProcessor cacheKey=ResourceId
cacheLookup=PackageVersion.PackageId
query=select ResourceId,[Text] PackageTag
   from [dbo].[Tag] Tag
   Where ResourceType = 0
deltaQuery=select ResourceId,[Text] PackageTag
from [dbo].[Tag] Tag
Where ResourceType = 0 and
Tag.TimeStamp > '${dih.last_index_time}'
parentDeltaQuery=select PackageVersion.PackageVersionId
  from [dbo].[Package]
  where
Package.Id=${PackageTag.ResourceId}
   /entity
/entity


Re: Solr - working with delta import and cache

2013-07-02 Thread Mysurf Mail
BTW: I just found out that delta import is only supported by the
SqlEntityProcessor.
Does it matter that I defined processor=CachedSqlEntityProcessor?


On Tue, Jul 2, 2013 at 5:58 PM, Mysurf Mail stammail...@gmail.com wrote:

 I have two entities in 1:n relation - PackageVersion and Tag.
 I have configured DIH to use CachedSqlEntityProcessor and everything works
 as planned.
 First, Tag entity is selected using the query attribute. Then the main
 entity.
 Ultra Fast.

 Now I am adding the delta import.
 Everything runs and loads, but too slow.
 Looking at the db profiler output i see :

1. the delta query of the inner entities run first - which is good.
2. the delta query of the main entities runs later - which is still
good.
3. deltaImportQuery of the main entity with each of the ID's runs as a
single select can be improved using where in all the result. Is it
possible?
4.

All of the Query attribute of the other tables are running now. This
is bad. (In real life I have more than one table in 1:n connection). for
instance I get a lot of

select ResourceId,[Text] PackageTag
from [dbo].[Tag] Tag
Where  ResourceType = 0


 run. Because it is from the Query attribute - there is no where clause for
 using the ids.
 a. How can I fix it ?
 b. Can I translate the importquery to use where in
 c. There is no real order for all the select when requesting deltaImport.
 is it possible to implement the caching also when updating delta?

 Here is my configuration

 entity name=PackageVersion pk=PackageVersionId
 query=  select 
 from [dbo].[Package] Package inner join 
 [dbo].[PackageVersion] PackageVersion on Package.Id = 
 PackageVersion.PackageId
 deltaQuery = select PackageVersion.Id PackageVersionId
   from [dbo].[Package] Package inner join 
 [dbo].[PackageVersion] PackageVersion on Package.Id = PackageVersion.PackageId
   where Package.LastModificationTime  
 '${dataimporter.last_index_time}' OR PackageVersion.Timestamp  
 '${dih.last_index_time}'
 deltaImportQuery= select 
 from [dbo].[Package] Package inner join 
 [dbo].[PackageVersion] PackageVersion on Package.Id = PackageVersion.PackageId
 Where PackageVersion.Id='${dih.delta.PackageVersionId}' 

 entity name=PackageTag pk=ResourceId 
 processor=CachedSqlEntityProcessor cacheKey=ResourceId 
 cacheLookup=PackageVersion.PackageId
 query=select ResourceId,[Text] PackageTag
from [dbo].[Tag] Tag
Where ResourceType = 0
 deltaQuery=select ResourceId,[Text] PackageTag
 from [dbo].[Tag] Tag
 Where ResourceType = 0 and Tag.TimeStamp 
  '${dih.last_index_time}'
 parentDeltaQuery=select 
 PackageVersion.PackageVersionId
   from [dbo].[Package]
   where 
 Package.Id=${PackageTag.ResourceId}
/entity
 /entity




Is there a way to speed up my import

2013-06-27 Thread Mysurf Mail
I have a relational database model
This is the basics of my data-config.xml

entity name=MyMainEntity pk=pID query=select ... from [dbo].[TableA]
inner join TableB on ...
 entity name=Entity1 pk=Id1 query=SELECT [Text] Tag from [Table2]
where ResourceId = '${MyMainEntity.pId}'/entity
entity name=Entity1 pk=Id2 query=SELECT [Text] Tag
from [Table2] where ResourceId2 = '${MyMainEntity.pId}'/entity
entity name=LibraryItem pk=ResourceId
query=select SKU
 FROM [TableB]
INNER JOIN ...
ON ...
 INNER JOIN ...
ON ...
WHERE ... AND ...'
 /entity
/entity

Now, this takes a lot of time.
1 rows in the first query, and then the other inner entities are
fetched later (around 10 rows each).

If I use a db profiler I see the three inner entity queries running over
and over (3 select statements, then again 3 select statements, over and over).
This is really not efficient,
and the import can run over 40 hrs.
Now,
what are my options to run it faster?
1. Obviously there is an option to flatten the tables into one big table - but
that will create a lot of other side effects.
I would really like to avoid that extra effort and run Solr on my
production relational tables.
So far it works great out of the box, and I am asking here if there
is a configuration tweak.
2. If I do flatten the rows - does the schema.xml need to be changed
too? Or will the fields that are multivalued stay multivalued?

Thanks.


Re: Is there a way to speed up my import

2013-06-27 Thread Mysurf Mail
I just configured the caching and it works mighty fast now.
Instead of an unbelievable number of queries, it queries only 4 times.
CPU usage has moved from the db to the Solr computer, but only for a very
short time.

Problem:
I don't see the multi-valued fields (inner entities) anymore.
This is my configuration:

entity name=PackageVersion pk=PackageVersionId
 query=select PackageVersion.Id PackageVersionId,  from 
entity name=PackageTag pk=ResourceId
processor=CachedSqlEntityProcessor where=ResourceId =
'${PackageVersion.PackageId}'
 query=SELECT [Text] PackageTag from [dbo].[Tag]
/entity
entity name=PackageVersionTag pk=ResourceId
processor=CachedSqlEntityProcessor where=ResourceId =
PackageVersion.PackageVersionId
 query=SELECT [Text] PackageVersionTag from [dbo].[Tag]
/entity
entity name=LibraryItem pk=ResourceId
processor=CachedSqlEntityProcessor where=Asset.[PackageVersionId] =
PackageVersion.PackageVersionId
 query=select CatalogVendorPartNum SKU, LibraryItems.[Description]
SKUDescription
FROM ...
 INNER JOIN ...
ON Asset.Id = LibraryVendors.DesignProjectId
INNER JOIN ...
 ON LibraryVendors.LibraryVendorId = LibraryItems.LibraryVendorId
WHERE Asset.[AssetTypeId]=1
 /entity
/entity

Now, when I query
http://localhost:8983/solr/vaultCache/select?q=*&indent=true
it returns only the main entity attributes.
Where are my inner entity attributes now?
Thanks a lot.







On Thu, Jun 27, 2013 at 10:15 AM, Gora Mohanty g...@mimirtech.com wrote:

 On 27 June 2013 12:32, Mysurf Mail stammail...@gmail.com wrote:
 
  I have a relational database model
  This is the basics of my data-config.xml
 
  entity name=MyMainEntity pk=pID query=select ... from
 [dbo].[TableA]
  inner join TableB on ...
   entity name=Entity1 pk=Id1 query=SELECT [Text] Tag from [Table2]
  where ResourceId = '${MyMainEntity.pId}'/entity
  entity name=Entity1 pk=Id2 query=SELECT [Text] Tag
  from [Table2] where ResourceId2 = '${MyMainEntity.pId}'/entity
  entity name=LibraryItem pk=ResourceId
  query=select SKU
   FROM [TableB]
  INNER JOIN ...
  ON ...
   INNER JOIN ...
  ON ...
  WHERE ... AND ...'
   /entity
  /entity
 
  Now, this takes a lot of time.
  1 rows in the first query and then each other inner entities are
  fetched later (around 10 rows each).
 
  If I use a db profiler I see a the three inner entities query running
 over
  and over (3 select sentences than again 3 select sentences over and over)
  This is really not efficient.
  And the import can run over 40 hrs ()
  Now,
  What are my options to run it faster .
  1. Obviously there is an option to flat the tables to one big table - but
  that will create a lot of other side effects.
  I would really like to avoid that extra effort and run solr on my
  production relational tables.
  So far it works great out of the box and I am searching here if there
  is a configuration tweak.
  2. If I will flat the rows that - does the schema.xml need to be change
  too? or the same fields that are multivalued will keep being multivalued.

 You have not shared your actual queries, so it is difficult
 to tell, but my guess would be that it is the JOINs that
 are the bottle-neck rather than the SELECTs. You should
 start by:
 1. Profile queries from the database back-end to see
 which are taking the most time, and try to simplify
 them.
 2. Make sure that relevant database columns are indexed.
 This can make a huge difference, though going overboard
  in indexing all columns might be counter-productive.
 3. Use Solr DIH's CachedSqlEntityProcessor:
 http://wiki.apache.org/solr/DataImportHandler#CachedSqlEntityProcessor
 4. Measure the time that Solr indexing takes: From your
 description, you seem to be guessing at it.

 In general, you should not flatten the records in the
 database as that is supposed to be relational data.

 Regards,
 Gora



Need assistance in defining search urls

2013-06-24 Thread Mysurf Mail
Now, each doc looks like this (I generated random user text in the free-text
columns in the DB):
doc str name=PackageNameWe have located the ship./str arr name=
CatalogVendorPartNum strd1771fc0-d3c2-472d-aa33-4bf5d1b79992/str str
b2986a4f-9687-404c-8d45-57b073d900f7/str str
a99cf760-d78e-493f-a827-585d11a765f3/str str
ba349832-c655-4a02-a552-d5b76b45d58c/str str
35e86a61-eba8-49f4-95af-8915bd9561ac/str str
6d8eb7d9-b417-4bda-b544-16bc26ab1d85/str str
31453eff-be19-4193-950f-fffcea70ef9e/str str
08e27e4f-3d07-4ede-a01d-4fdea3f7ddb0/str str
79a19a3f-3f1b-486f-9a84-3fb40c41e9c7/str str
b34c6f78-75b1-42f1-8ec7-e03d874497df/str /arr float name=score
1.7437795/float/doc doc
My searches are:
(PackageName is defined as the default search field)

1. I try to search for any package whose name has the word have or had
or has
2. I try to search for any package that contains
d1771fc0-d3c2-472d-aa33-4bf5d1b79992

Therefore I use these searches:

1.
http://localhost:8983/solr/vault/select?q=*have*&fl=PackageName%2Cscore&defType=edismax&stopwords=true&lowercaseOperators=true

Questions:
1.a. Even if I display all results, I don't get any results with has
(inflections). Why?
1.b. What is the difference between *have* and have?
The score is different.

2.
http://localhost:8983/solr/vault/select?q=*:d1771fc0-d3c2-472d-aa33-4bf5d1b79992&fl=PackageName,score&defType=edismax&stopwords=true&lowercaseOperators=true&start=0&rows=300

Questions:
2.a. I get no result, even though I search on all fields (*) and it
appears in the document.
2.b. If I want to search on more than one field, i.e. PackageName and
Description, what is the best way to do it?
Define them all as default?
Thanks,


What should be the definitions ( field type ) for a field that will be search with user free text

2013-06-24 Thread Mysurf Mail
Currently I am using text_general.
I want to search with user free-text search, therefore I would like
tokenization, stemming, ...
How do I define stemmers?
Should I use text_en instead of text_general?
Thank you.


Re: Need assistance in defining search urls

2013-06-24 Thread Mysurf Mail
Thanks Jack and Giovanni.
Jack:
Regarding 1.b, have vs *have*: the results were identical apart from the
score.
Basically I can't do all the stuff you recommended. I want a stemmer for an
unknown search (the query is sent when a user enters free text into a textbox).

Giovanni - regarding the requestHandler named test:
will I need to query using /test/...?
Shouldn't it be named /test?
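(A minimal sketch of what I understand that to mean - a handler registered
with a leading slash in solrconfig.xml:

  <requestHandler name="/test" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="defType">edismax</str>
      <str name="qf">PackageName Description</str>
    </lst>
  </requestHandler>

is then queried by path, e.g. http://localhost:8983/solr/vault/test?q=have ;
the qf field names here are assumptions.)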

.



On Mon, Jun 24, 2013 at 4:40 PM, Jack Krupansky j...@basetechnology.comwrote:

 I don't get any results with has (inflections). Why?

 Wildcard patterns on strings are literal, exact. There is no automatic
 natural language processing.

 You could try a regular expression match:

 q=/ ha(s|ve) /

 Or, just use OR:

 q=*has* OR *have*

 Or, use a copyField of the package name to a text field and than you can
 use simple keywords:

 q=package_name_text:(has OR have)
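 A sketch of that copyField route, with assumed names - the dest field and
 its type are not from the thread:

   <field name="package_name_text" type="text_general" indexed="true" stored="false"/>
   <copyField source="PackageName" dest="package_name_text"/>

 After reindexing, q=package_name_text:(has OR have) matches individual words.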

 Is PackageName a string field?

 Or, maybe best, use an update processor to populate a Boolean field to
 indicate whether the has/have pattern is seen in the package name. A simple
 JavaScript script with a StatelessScriptUpdateProcessor could do this in
 just a couple of lines and make the query much faster.

 For question 1.b the two queries seem identical - was that the case?

 There is no *: feature to query all fields in Solr - although the
 LucidWorks Search query parser does support that feature.

 -- Jack Krupansky

 -Original Message- From: Mysurf Mail
 Sent: Monday, June 24, 2013 7:26 AM
 To: solr-user@lucene.apache.org
 Subject: Need assistance in defining search urls


 Now, each doc looks like this (i generated random user text in the freetext
 columns in the DB)
 doc str name=PackageNameWe have located the ship./str arr name=
 CatalogVendorPartNum strd1771fc0-d3c2-472d-aa33-**4bf5d1b79992/str
 str

 b2986a4f-9687-404c-8d45-**57b073d900f7/str str

 a99cf760-d78e-493f-a827-**585d11a765f3/str str
 ba349832-c655-4a02-a552-**d5b76b45d58c/str str
 35e86a61-eba8-49f4-95af-**8915bd9561ac/str str
 6d8eb7d9-b417-4bda-b544-**16bc26ab1d85/str str
 31453eff-be19-4193-950f-**fffcea70ef9e/str str
 08e27e4f-3d07-4ede-a01d-**4fdea3f7ddb0/str str
 79a19a3f-3f1b-486f-9a84-**3fb40c41e9c7/str str
 b34c6f78-75b1-42f1-8ec7-**e03d874497df/str /arr float name=score
 1.7437795/float/doc doc
 My searches are :
 (PackageName is deined as default search)

 1. I try to search for any package that name has the word have or had
 or has
 2. I try to search for any package that consists
 d1771fc0-d3c2-472d-aa33-**4bf5d1b79992

 Therefore I use this searches

 1.
 http://localhost:8983/solr/**vault/select?q=*have*fl=**
 PackageName%2CscoredefType=**edismaxstopwords=true**
 lowercaseOperators=truehttp://localhost:8983/solr/vault/select?q=*have*fl=PackageName%2CscoredefType=edismaxstopwords=truelowercaseOperators=true

 questions :
 1.a. even if i display all results, I dont get any results with has 
 (inflections). Why?
 1.b. what is the difference between
 *have*http://localhost:8983/**solr/vault/select?q=*have*fl=**
 PackageName%2CscoredefType=**edismaxstopwords=true**
 lowercaseOperators=truehttp://localhost:8983/solr/vault/select?q=*have*fl=PackageName%2CscoredefType=edismaxstopwords=truelowercaseOperators=true
 
 and havehttp://localhost:8983/**solr/vault/select?q=*have*fl=**
 PackageName%2CscoredefType=**edismaxstopwords=true**
 lowercaseOperators=truehttp://localhost:8983/solr/vault/select?q=*have*fl=PackageName%2CscoredefType=edismaxstopwords=truelowercaseOperators=true
 .
 the score is differnt.

 2.
 http://localhost:8983/solr/**vault/select?q=*:d1771fc0-**
 d3c2-472d-aa33-4bf5d1b79992**fl=PackageName,scoredefType=**
 edismaxstopwords=true**lowercaseOperators=truestart=**0rows=300http://localhost:8983/solr/vault/select?q=*:d1771fc0-d3c2-472d-aa33-4bf5d1b79992fl=PackageName,scoredefType=edismaxstopwords=truelowercaseOperators=truestart=0rows=300

 Questions:
 2.a. I get no result. even though i search it on all fields. (*) and it
 appears in
 2.b. If I want to search on more than one field i.e. packageName 
 description, what is the best way to do it?
 define all as default?
 Thanks,



Re: Need assistance in defining search urls

2013-06-24 Thread Mysurf Mail
Regarding
There is no *: feature to query all fields in Solr

When I enter the dashboard - solr/#/[core]/query
the default is *:*
and it brings everything.


On Mon, Jun 24, 2013 at 5:41 PM, Mysurf Mail stammail...@gmail.com wrote:

 Thanks Jack and Giovanni.
 Jack:
 Regarding 1.b. have vs *have* the results were identical apart from the
 score.
 Basically i cant do all the stuff you recommended. I want a stemmer for an
 unknown search (send the query when user enters free text to a textbox ).

 giovanni-  regarding requestHandler  test
 will I need to query using /test/...?
 shouldnt it be names /test?

 .



 On Mon, Jun 24, 2013 at 4:40 PM, Jack Krupansky 
 j...@basetechnology.comwrote:

 I don't get any results with has (inflections). Why?

 Wildcard patterns on strings are literal, exact. There is no automatic
 natural language processing.

 You could try a regular expression match:

 q=/ ha(s|ve) /

 Or, just use OR:

 q=*has* OR *have*

 Or, use a copyField of the package name to a text field and than you
 can use simple keywords:

 q=package_name_text:(has OR have)

 Is PackageName a string field?

 Or, maybe best, use an update processor to populate a Boolean field to
 indicate whether the has/have pattern is seen in the package name. A simple
 JavaScript script with a StatelessScriptUpdateProcessor could do this in
 just a couple of lines and make the query much faster.

 For question 1.b the two queries seem identical - was that the case?

 There is no *: feature to query all fields in Solr - although the
 LucidWorks Search query parser does support that feature.

 -- Jack Krupansky

 -Original Message- From: Mysurf Mail
 Sent: Monday, June 24, 2013 7:26 AM
 To: solr-user@lucene.apache.org
 Subject: Need assistance in defining search urls


 Now, each doc looks like this (i generated random user text in the
 freetext
 columns in the DB)
 doc str name=PackageNameWe have located the ship./str arr name=
 CatalogVendorPartNum strd1771fc0-d3c2-472d-aa33-**4bf5d1b79992/str
 str

 b2986a4f-9687-404c-8d45-**57b073d900f7/str str

 a99cf760-d78e-493f-a827-**585d11a765f3/str str
 ba349832-c655-4a02-a552-**d5b76b45d58c/str str
 35e86a61-eba8-49f4-95af-**8915bd9561ac/str str
 6d8eb7d9-b417-4bda-b544-**16bc26ab1d85/str str
 31453eff-be19-4193-950f-**fffcea70ef9e/str str
 08e27e4f-3d07-4ede-a01d-**4fdea3f7ddb0/str str
 79a19a3f-3f1b-486f-9a84-**3fb40c41e9c7/str str
 b34c6f78-75b1-42f1-8ec7-**e03d874497df/str /arr float name=score
 1.7437795/float/doc doc
 My searches are :
 (PackageName is deined as default search)

 1. I try to search for any package that name has the word have or had
 or has
 2. I try to search for any package that consists
 d1771fc0-d3c2-472d-aa33-**4bf5d1b79992

 Therefore I use this searches

 1.
 http://localhost:8983/solr/**vault/select?q=*have*fl=**
 PackageName%2CscoredefType=**edismaxstopwords=true**
 lowercaseOperators=truehttp://localhost:8983/solr/vault/select?q=*have*fl=PackageName%2CscoredefType=edismaxstopwords=truelowercaseOperators=true

 questions :
 1.a. even if i display all results, I dont get any results with has 
 (inflections). Why?
 1.b. what is the difference between
 *have*http://localhost:8983/**solr/vault/select?q=*have*fl=**
 PackageName%2CscoredefType=**edismaxstopwords=true**
 lowercaseOperators=truehttp://localhost:8983/solr/vault/select?q=*have*fl=PackageName%2CscoredefType=edismaxstopwords=truelowercaseOperators=true
 
 and havehttp://localhost:8983/**solr/vault/select?q=*have*fl=**
 PackageName%2CscoredefType=**edismaxstopwords=true**
 lowercaseOperators=truehttp://localhost:8983/solr/vault/select?q=*have*fl=PackageName%2CscoredefType=edismaxstopwords=truelowercaseOperators=true
 .
  the score is differnt.

 2.
 http://localhost:8983/solr/**vault/select?q=*:d1771fc0-**
 d3c2-472d-aa33-4bf5d1b79992**fl=PackageName,scoredefType=**
 edismaxstopwords=true**lowercaseOperators=truestart=**0rows=300http://localhost:8983/solr/vault/select?q=*:d1771fc0-d3c2-472d-aa33-4bf5d1b79992fl=PackageName,scoredefType=edismaxstopwords=truelowercaseOperators=truestart=0rows=300

 Questions:
 2.a. I get no result. even though i search it on all fields. (*) and it
 appears in
 2.b. If I want to search on more than one field i.e. packageName 
 description, what is the best way to do it?
 define all as default?
 Thanks,





why does the uniqueKey have to be indexed.

2013-06-24 Thread Mysurf Mail
Currently, I can't define my unique key with indexed=false.

As I understand from the docs, the field attribute indexed should be true
only if I want the field to be searchable or sortable.

Let's say I have a schema with id and name only; wouldn't I want the
following configuration:
id - indexed=false, stored=true
name - indexed=true, stored=true

I don't want the id to be searched but I would want it to be defined as the
unique key and to be stored (for retrieval).


Re: What should be the definitions ( field type ) for a field that will be search with user free text

2013-06-24 Thread Mysurf Mail
Thanks.


On Mon, Jun 24, 2013 at 5:52 PM, Jack Krupansky j...@basetechnology.comwrote:

 The general idea is that tokenization can generally be done in a
 language-independent manner, but stemming, synonyms, stop words, etc. must
 be done in a language-dependent manner.

 So, yes, text_en is a better starting point for adding in the more
 advanced language processing features.
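
 For illustration, a trimmed sketch along the lines of the text_en type that
 ships in the example schema (details vary by Solr version):

   <fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
     <analyzer>
       <tokenizer class="solr.StandardTokenizerFactory"/>
       <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
       <filter class="solr.LowerCaseFilterFactory"/>
       <filter class="solr.PorterStemFilterFactory"/>
     </analyzer>
   </fieldType>

 The PorterStemFilterFactory is what reduces inflected forms (e.g.
 required/requires/requiring -> requir) to a common stem.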

 -- Jack Krupansky

 -Original Message- From: Mysurf Mail
 Sent: Monday, June 24, 2013 10:26 AM
 To: solr-user@lucene.apache.org
 Subject: What should be the definitions ( field type ) for a field that
 will be search with user free text


 Currently I am using text_general.
 I want to search with user free-text search, therefore I would like
 tokenization, stemming, etc.
 How do I define stemmers?
 Should I use text_en instead of text_general?
 Thank you.



Re: modeling multiple values on 1:n connection

2013-06-23 Thread Mysurf Mail
Thanks for your comment.
What I need is to model it so that I can connect the featureName with the
description of the same feature.
Currently, if an item has 3 features I get two lists - each three elements long.
But then I need to correlate them.
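
One possible workaround (a sketch, not something from the thread): pair the
two columns in the child entity's SQL so they arrive as one multivalued field,
e.g.

  <entity name="feature"
          query="SELECT featureName || ': ' || description AS feature
                 FROM feature WHERE item_id='${item.ID}'"/>

(|| is the HSQLDB concatenation operator used in the wiki example; on SQL
Server it would be +.) Each value of the resulting feature field then keeps a
name and its description together.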



On Sun, Jun 23, 2013 at 9:25 AM, Gora Mohanty g...@mimirtech.com wrote:

 On 23 June 2013 01:31, Mysurf Mail stammail...@gmail.com wrote:
  I try to model my db using
  thishttp://wiki.apache.org/solr/DataImportHandler#Full_Import_Example
 example
  from solr wiki.
 
  I have a table called item and a table called features with
  id,featureName,description
 
  here is the updated xml (added featureName)
 
  dataConfig
  dataSource driver=org.hsqldb.jdbcDriver
  url=jdbc:hsqldb:/temp/example/ex user=sa /
  document
  entity name=item query=select * from item
  entity name=feature query=select description,
  featureName as features from feature where item_id='${item.ID}'/
  /entity
  /document
 
 
  Now I get two lists in the xml element
 
  doc
  arr name=featureName
  strnumber of miles in every direction the universal cataclysm was
  gathering/str strAll around the Restaurant people and things relaxed
  and chatted. The/str str- Do we have... - he put up a hand to hold
 back
  the cheers, - Do we/str /arr arr name=description
  strto a stupefying climax. Glancing at his watch, Max returned to the
  stage/str strair was filled with talk of this and that, and with the
  mingled scents of/str strhave a party here from the Zansellquasure
  Flamarion Bridge Club from/str
  /arr.
  /doc
  But I would like to see the list together (using xml attributes) so that
 I
  dont have to join the values.
  Is it possible?

 While it is not clear to me what you are asking, I am
 guessing that you do not want the featureName and
 description fields to appear as arrays. This is happening
 because you have defined them as multi-valued in the
 Solr schema. What exactly do you want to join here?

 Regards,
 Gora



modeling multiple values on 1:n connection

2013-06-22 Thread Mysurf Mail
I try to model my db using
thishttp://wiki.apache.org/solr/DataImportHandler#Full_Import_Exampleexample
from solr wiki.

I have a table called item and a table called features with
id,featureName,description

here is the updated xml (added featureName)

<dataConfig>
  <dataSource driver="org.hsqldb.jdbcDriver"
              url="jdbc:hsqldb:/temp/example/ex" user="sa"/>
  <document>
    <entity name="item" query="select * from item">
      <entity name="feature" query="select description,
              featureName as features from feature where item_id='${item.ID}'"/>
    </entity>
  </document>
</dataConfig>


Now I get two lists in the xml element

<doc>
  <arr name="featureName">
    <str>number of miles in every direction the universal cataclysm was gathering</str>
    <str>All around the Restaurant people and things relaxed and chatted. The</str>
    <str>- Do we have... - he put up a hand to hold back the cheers, - Do we</str>
  </arr>
  <arr name="description">
    <str>to a stupefying climax. Glancing at his watch, Max returned to the stage</str>
    <str>air was filled with talk of this and that, and with the mingled scents of</str>
    <str>have a party here from the Zansellquasure Flamarion Bridge Club from</str>
  </arr>
</doc>
But I would like to see the list together (using xml attributes) so that I
dont have to join the values.
Is it possible?


Re: How to define my data in schema.xml

2013-06-19 Thread Mysurf Mail
Well,
Avoiding flattening the db to a flat table sounds like a great plan.
I found this solution
http://wiki.apache.org/solr/DataImportHandler#Full_Import_Example

It imports a join instead of requiring a flat table.



On Tue, Jun 18, 2013 at 5:53 PM, Jack Krupansky j...@basetechnology.comwrote:

 You can in fact have multiple collections in Solr and do a limited amount
 of joining, and Solr has multivalued fields as well, but none of those
 techniques should be used to avoid the process of flattening and
 denormalizing a relational data model. It is hard work, but yes, it is
 required to use Solr effectively.

 Again, start with the queries - what problem are you trying to solve.
 Nobody stores data just for the sake of storing it - how will the data be
 used?


 -- Jack Krupansky

 -Original Message- From: Mysurf Mail
 Sent: Tuesday, June 18, 2013 9:58 AM

 To: solr-user@lucene.apache.org
 Subject: Re: How to define my data in schema.xml

 Hi Jack,
 Thanks, for you kind comment.

 I am truly in the beginning of data modeling my schema over an existing
 working DB.
 I have used the school-teachers-student db as an example scenario.
 (a, I have written it as a disclaimer in my first post. b. I really do not
 know anyone that has 300 hobbies too.)

 In real life my db is obviously much different,
 I just used this as an example of potential pitfalls that will occur if I
 use my old db data modeling notions.
 obviously, the old relational modeling idioms do not apply here.

 Now, my question was referring to the fact that I would really like to
 avoid a flat table/join/view because of the reason listed above.
 So, my scenario is answering a plain user generated text search over a
 MSSQLDB that contains a few 1:n relation (and a few 1:n:n relationship).

 So, I come here for tips. Should I use one combined index (treat it as a
 nosql source) or separate indices or another. any other ways to define
 relation data ?
 Thanks.



 On Tue, Jun 18, 2013 at 4:30 PM, Jack Krupansky j...@basetechnology.com*
 *wrote:

  It sounds like you still have a lot of work to do on your data model. No
 matter how you slice it, 8 billion rows/fields/whatever is still way too
 much for any engine to search on a single server. If you have 8 billion of
 anything, a heavily sharded SolrCloud cluster is probably warranted. Don't
 plan ahead to put more than 100 million rows on a single node; plan on a
 proof of concept implementation to determine that number.

 When we in Solr land say flattened or denormalized, we mean in an
 intelligent, smart, thoughtful sense, not a mindless, mechanical
 flattening. It is an opportunity for you to reconsider your data models,
 both old and new.

 Maybe data modeling is beyond your skill set. If so, have a chat with your
 boss and ask for some assistance, training, whatever.

 Actually, I am suspicious of your 8 billion number - change each of those
 300's to realistic, average numbers. Each teacher teaches 300 courses?
 Right. Each Student has 300 hobbies? If you say so, but...

 Don't worry about schema.xml until you get your data model under control.

 For an initial focus, try envisioning the use cases for user queries. That
 will guide you in thinking about how the data would need to be organized
 to
 satisfy those user queries.

 -- Jack Krupansky

 -Original Message- From: Mysurf Mail
 Sent: Tuesday, June 18, 2013 2:20 AM
 To: solr-user@lucene.apache.org
 Subject: Re: How to define my data in schema.xml


 Thanks for your reply.
 I have tried the simplest approach and it works absolutely fantastic.
 Huge table - 0s to result.

 two problems as I described earlier, and that is what I try to solve:
 1. I create a flat table just for solar. This requires maintenance and
 develop. Can I run solr over my regular tables?
This is my simplest approach. Working over my relational tables,
 2. When you query a flat table by school name, as I described, if the
 school has 300 student, 300 teachers, 300  with 300 teacherCourses, 300
 studentHobbies,
you get 8.1 Billion rows (300*300*300*300). As I am sure this will work
 great on solar - searching for the school name will retrieve 8.1 B rows.
 3. Lets say all my searches are user generated free text search that is
 searching name and comments columns.
 Thanks.


 On Tue, Jun 18, 2013 at 7:32 AM, Gora Mohanty g...@mimirtech.com wrote:

  On 18 June 2013 01:10, Mysurf Mail stammail...@gmail.com wrote:

  Thanks for your quick reply. Here are some notes:
 
  1. Consider that all tables in my example have two columns: Name 
  Description which I would like to index and search.
  2. I have no other reason to create flat table other than for solar. So
  I
  would like to see if I can avoid it.
  3. If in my example I will have a flat table then obviously it will 
 hold
 a
  lot of rows for a single school.
  By searching the exact school name I will likely receive a lot of
 rows.
  (my flat table has its own pk)

 Yes, all

Re: Solr data files

2013-06-18 Thread Mysurf Mail
Thanks.,


On Mon, Jun 17, 2013 at 10:42 PM, Alexandre Rafalovitch
arafa...@gmail.comwrote:

 The index files are under the the collection's directory in the
 subdirectory called 'data'. Right next to the directory called 'conf'
 where your schema.xml and solrconfig.xml live.

 If the Solr is not running, you can delete that directory to clear the
 index content. I don't think you can do that while Solr is running.
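 For orientation, the layout typically looks roughly like this (a sketch of
 the default single-core example; paths may differ in your setup):

   solr/
     collection1/
       conf/    <- schema.xml, solrconfig.xml
       data/
         index/ <- the Lucene index files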

 Regards,
Alex.
 Personal website: http://www.outerthoughts.com/
 LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
 - Time is the quality of nature that keeps events from happening all
 at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
 book)


 On Mon, Jun 17, 2013 at 3:33 PM, Mysurf Mail stammail...@gmail.com
 wrote:
  Where are the core data files located?
  Can I just delete folder/files in order to quick clean the core/indexes?
  Thanks



Re: How to define my data in schema.xml

2013-06-18 Thread Mysurf Mail
Thanks for your reply.
I have tried the simplest approach and it works absolutely fantastically.
Huge table - 0s to result.

Two problems, as I described earlier, are what I am trying to solve:
1. I create a flat table just for Solr. This requires maintenance and
development. Can I run Solr over my regular tables?
   That is my simplest approach: working over my relational tables.
2. When you query a flat table by school name, as I described, if the
school has 300 students, 300 teachers, 300 teacherCourses and 300
studentHobbies,
you get 8.1 billion rows (300*300*300*300). As I am sure this will work
great on Solr - searching for the school name will retrieve 8.1B rows.
3. Let's say all my searches are user-generated free-text searches over
the name and comments columns.
Thanks.


On Tue, Jun 18, 2013 at 7:32 AM, Gora Mohanty g...@mimirtech.com wrote:

 On 18 June 2013 01:10, Mysurf Mail stammail...@gmail.com wrote:
  Thanks for your quick reply. Here are some notes:
 
  1. Consider that all tables in my example have two columns: Name 
  Description which I would like to index and search.
  2. I have no other reason to create flat table other than for solar. So I
  would like to see if I can avoid it.
  3. If in my example I will have a flat table then obviously it will hold
 a
  lot of rows for a single school.
  By searching the exact school name I will likely receive a lot of
 rows.
  (my flat table has its own pk)

 Yes, all of this is definitely the case, but in practice
 it does not matter. Solr can efficiently search through
 millions of rows. To start with, just try the simplest
 approach, and only complicate things as and when
 needed.

  That is something I would like to avoid and I thought I can avoid
 this
  by defining teachers and students as multiple value or something like
 this
  and than teacherCourses and studentHobbies  as 1:n respectively.
  This is quite similiar to my real life demand, so I came here to get
  some tips as a solr noob.

 You have still not described what are the searches that
 you would want to do. Again, I would suggest starting
 with the most straightforward approach.

 Regards,
 Gora



implementing identity authentication in SOLR

2013-06-18 Thread Mysurf Mail
Hi,
In order to add Solr to my prod environment I have to implement some
security restrictions.
Is there a way to add user/pass to the requests and to keep them
*encrypted* in a file?
Thanks.


Re: implementing identity authentication in SOLR

2013-06-18 Thread Mysurf Mail
Just to make sure.
In my previous question I was referring to the user/pass that queries the
db.

Now I was referring to the user/pass that I want for the Solr HTTP request.
Think of it as if my user sends a request where he filters documents created
by another user.
I want to restrict that.

I currently work in a .NET environment where we have an identity provider that
provides trusted claims to the HTTP request.
In similar situations I take the user name property from a trusted claim
and not from a parameter in the URL.

I want to know how solr can restrict his http request/responses.
Thank you.


On Tue, Jun 18, 2013 at 10:56 AM, Gora Mohanty g...@mimirtech.com wrote:

 On 18 June 2013 13:10, Mysurf Mail stammail...@gmail.com wrote:
  Hi,
  In order to add solr to my prod environmnet I have to implement some
  security restriction.
  Is there a way to add user/pass to the requests and to keep them
  *encrypted*in a file.

 As mentioned earlier, no there is no built-in way of doing that
 if you are using the Solr DataImportHandler.

 Probably the easiest way would be to implement your own
 indexing using a library like SolrJ. Then, you can handle encryption
 as you wish.

 Regards,
 Gora



Re: Need assistance in defining solr to process user generated query text

2013-06-18 Thread Mysurf Mail
great tip :-)


On Tue, Jun 18, 2013 at 2:36 PM, Erick Erickson erickerick...@gmail.comwrote:

 if the _solr_ type is string, then you aren't getting any
 tokenization, so my dog has fleas is indexed as
 my dog has fleas, a single token. To search
 for individual words you need to use, say, the
 text_general type, which would index
 my dog has fleas
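 (So a possible fix, sketched with the field name from earlier in the thread:
 change the field's type in schema.xml to a tokenized type, e.g.

   <field name="PackageName" type="text_general" indexed="true" stored="true"/>

 and reindex.)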

 Best
 Erick

 On Mon, Jun 17, 2013 at 11:26 AM, Mysurf Mail stammail...@gmail.com
 wrote:
  I have one fact table with a lot of string columns and a few GUIDs just
 for
  retreival (Not for search)
 
 
 
  On Mon, Jun 17, 2013 at 6:01 PM, Jack Krupansky j...@basetechnology.com
 wrote:
 
  It sounds like you have your text indexed in a string field (why the
  wildcards are needed), or that maybe you are using the keyword
 tokenizer
  rather than the standard tokenizer.
 
  What is your default or query fields for dismax/edismax? And what are
 the
  field types for those fields?
 
  -- Jack Krupansky
 
  -Original Message- From: Mysurf Mail
  Sent: Monday, June 17, 2013 10:51 AM
  To: solr-user@lucene.apache.org
  Subject: Need assistance in defining solr to process user generated
 query
  text
 
 
  Hi,
  I have been reading solr wiki pages and configured solr successfully
 over
  my flat table.
  I have a few question though regarding the querying and parsing of user
  generated text.
 
  1. I have understood through this http://wiki.apache.org/solr/**DisMax
 http://wiki.apache.org/solr/DisMaxpage
  that
 
  I want to use dismax.
 Through this http://wiki.apache.org/solr/**LocalParams
 http://wiki.apache.org/solr/LocalParamspage
  I can do it
 
  using localparams
 
 But I think the best way is to define this in my xml files.
 Can I do this?
 
  2.in this http://lucene.apache.org/**solr/4_3_0/tutorial.html
 http://lucene.apache.org/solr/4_3_0/tutorial.html
  **tutorial
 
  (solr) the following query appears
 
 http://localhost:8983/solr/#/**collection1/query?q=video
 http://localhost:8983/solr/#/collection1/query?q=video
 
 When I want to query my fact table  I have to query using *video*.
 just video retrieves nothing.
 How can I query it using video only?
  3. In this http://wiki.apache.org/solr/**ExtendedDisMax#Configuration
 http://wiki.apache.org/solr/ExtendedDisMax#Configuration
  **page
 
  it says that
  Extended DisMax is already configured in the example configuration,
 with
  the name edismax
  But I see it only in the /browse requestHandler
  as follows:
 
 
  requestHandler name=/browse class=solr.SearchHandler
  lst name=defaults
str name=echoParamsexplicit/**str
 ...
 !-- Query settings --
str name=defTypeedismax/str
 
  Do I use it also when I use select in my url ?
 
  4. In general, I want to transfer a user generated text to my url
 request
  using the most standard rules (translate ,+,- signs to the q parameter
  value).
  What is the best way to
 
 
 
  Thanks.
 



Re: Is there a way to encrypt username and pass in the solr config file

2013-06-18 Thread Mysurf Mail
@Gora: yes.
User name and pass.


On Tue, Jun 18, 2013 at 2:57 PM, Gora Mohanty g...@mimirtech.com wrote:

 On 18 June 2013 17:16, Erick Erickson erickerick...@gmail.com wrote:
  What do you mean encrypt? The stored value?
  the indexed value? Over the wire?
 [...]

 My understanding was that he wanted to encrypt the
 username/password in the DIH configuration file.
 Mysurf Mail, could you please clarify?

 Regards,
 Gora



Re: How to define my data in schema.xml

2013-06-18 Thread Mysurf Mail
Hi Jack,
Thanks for your kind comment.

I am truly at the beginning of data modeling my schema over an existing
working DB.
I have used the school-teachers-students DB as an example scenario.
(a. I wrote it as a disclaimer in my first post. b. I really do not
know anyone that has 300 hobbies either.)

In real life my DB is obviously much different;
I just used this as an example of the potential pitfalls that will occur if I
use my old DB data modeling notions.
Obviously, the old relational modeling idioms do not apply here.

Now, my question was referring to the fact that I would really like to
avoid a flat table/join/view, for the reasons listed above.
So, my scenario is answering a plain user-generated text search over an
MS SQL DB that contains a few 1:n relations (and a few 1:n:n relationships).

So, I came here for tips. Should I use one combined index (treating it as a
NoSQL source), separate indices, or another approach? Are there other ways to
define relational data?
Thanks.



On Tue, Jun 18, 2013 at 4:30 PM, Jack Krupansky j...@basetechnology.comwrote:

 It sounds like you still have a lot of work to do on your data model. No
 matter how you slice it, 8 billion rows/fields/whatever is still way too
 much for any engine to search on a single server. If you have 8 billion of
 anything, a heavily sharded SolrCloud cluster is probably warranted. Don't
 plan ahead to put more than 100 million rows on a single node; plan on a
 proof of concept implementation to determine that number.

 When we in Solr land say flattened or denormalized, we mean in an
 intelligent, smart, thoughtful sense, not a mindless, mechanical
 flattening. It is an opportunity for you to reconsider your data models,
 both old and new.

 Maybe data modeling is beyond your skill set. If so, have a chat with your
 boss and ask for some assistance, training, whatever.

 Actually, I am suspicious of your 8 billion number - change each of those
 300's to realistic, average numbers. Each teacher teaches 300 courses?
 Right. Each Student has 300 hobbies? If you say so, but...

 Don't worry about schema.xml until you get your data model under control.

 For an initial focus, try envisioning the use cases for user queries. That
 will guide you in thinking about how the data would need to be organized to
 satisfy those user queries.

 -- Jack Krupansky

 -Original Message- From: Mysurf Mail
 Sent: Tuesday, June 18, 2013 2:20 AM
 To: solr-user@lucene.apache.org
 Subject: Re: How to define my data in schema.xml


 Thanks for your reply.
 I have tried the simplest approach and it works absolutely fantastic.
 Huge table - 0s to result.

 two problems as I described earlier, and that is what I try to solve:
 1. I create a flat table just for solar. This requires maintenance and
 develop. Can I run solr over my regular tables?
This is my simplest approach. Working over my relational tables,
 2. When you query a flat table by school name, as I described, if the
 school has 300 student, 300 teachers, 300  with 300 teacherCourses, 300
 studentHobbies,
you get 8.1 Billion rows (300*300*300*300). As I am sure this will work
 great on solar - searching for the school name will retrieve 8.1 B rows.
 3. Lets say all my searches are user generated free text search that is
 searching name and comments columns.
 Thanks.


 On Tue, Jun 18, 2013 at 7:32 AM, Gora Mohanty g...@mimirtech.com wrote:

  On 18 June 2013 01:10, Mysurf Mail stammail...@gmail.com wrote:
  Thanks for your quick reply. Here are some notes:
 
  1. Consider that all tables in my example have two columns: Name 
  Description which I would like to index and search.
  2. I have no other reason to create flat table other than for solar. So
  I
  would like to see if I can avoid it.
  3. If in my example I will have a flat table then obviously it will hold
 a
  lot of rows for a single school.
  By searching the exact school name I will likely receive a lot of
 rows.
  (my flat table has its own pk)

 Yes, all of this is definitely the case, but in practice
 it does not matter. Solr can efficiently search through
 millions of rows. To start with, just try the simplest
 approach, and only complicate things as and when
 needed.

  That is something I would like to avoid and I thought I can avoid
 this
  by defining teachers and students as multiple value or something like
 this
  and than teacherCourses and studentHobbies  as 1:n respectively.
  This is quite similiar to my real life demand, so I came here to get
  some tips as a solr noob.

 You have still not described what are the searches that
 you would want to do. Again, I would suggest starting
 with the most straightforward approach.

 Regards,
 Gora





Need assistance in defining solr to process user generated query text

2013-06-17 Thread Mysurf Mail
Hi,
I have been reading solr wiki pages and configured solr successfully over
my flat table.
I have a few question though regarding the querying and parsing of user
generated text.

1. I have understood from this page - http://wiki.apache.org/solr/DisMax -
that I want to use dismax.
Through this page - http://wiki.apache.org/solr/LocalParams - I can do it
using localparams.

But I think the best way is to define this in my xml files.
Can I do this?

2. In this (Solr) tutorial - http://lucene.apache.org/solr/4_3_0/tutorial.html -
the following query appears:

http://localhost:8983/solr/#/collection1/query?q=video

When I want to query my fact table I have to query using *video*;
just video retrieves nothing.
How can I query it using video only?
3. In this page - http://wiki.apache.org/solr/ExtendedDisMax#Configuration -
it says that
Extended DisMax is already configured in the example configuration, with
the name edismax,
but I see it only in the /browse requestHandler,
as follows:

<requestHandler name="/browse" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
...
    <!-- Query settings -->
    <str name="defType">edismax</str>

Do I also use it when I use /select in my URL?
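
(A sketch of the two usual options, not from the thread - either pass it on
the /select URL, e.g.

  http://localhost:8983/solr/collection1/select?q=video&defType=edismax&qf=name+description

or add it to the /select handler's defaults in solrconfig.xml:

  <requestHandler name="/select" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="defType">edismax</str>
      <str name="qf">name description</str>
    </lst>
  </requestHandler>

The qf field names are assumptions based on the Name/Description columns
mentioned earlier.)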

4. In general, I want to transfer user-generated text to my URL request
using the most standard rules (translating the , + and - signs into the q
parameter value).
What is the best way to



Thanks.


Re: Need assistance in defining solr to process user generated query text

2013-06-17 Thread Mysurf Mail
I have one fact table with a lot of string columns and a few GUIDs just for
retrieval (not for search).



On Mon, Jun 17, 2013 at 6:01 PM, Jack Krupansky j...@basetechnology.comwrote:

 It sounds like you have your text indexed in a string field (why the
 wildcards are needed), or that maybe you are using the keyword tokenizer
 rather than the standard tokenizer.

 What is your default or query fields for dismax/edismax? And what are the
 field types for those fields?

 -- Jack Krupansky

 -Original Message- From: Mysurf Mail
 Sent: Monday, June 17, 2013 10:51 AM
 To: solr-user@lucene.apache.org
 Subject: Need assistance in defining solr to process user generated query
 text


 Hi,
 I have been reading solr wiki pages and configured solr successfully over
 my flat table.
 I have a few question though regarding the querying and parsing of user
 generated text.

 1. I have understood through this 
 http://wiki.apache.org/solr/**DisMaxhttp://wiki.apache.org/solr/DisMaxpage
 that

 I want to use dismax.
Through this 
 http://wiki.apache.org/solr/**LocalParamshttp://wiki.apache.org/solr/LocalParamspage
 I can do it

 using localparams

But I think the best way is to define this in my xml files.
Can I do this?

 2.in this 
 http://lucene.apache.org/**solr/4_3_0/tutorial.htmlhttp://lucene.apache.org/solr/4_3_0/tutorial.html
 **tutorial

 (solr) the following query appears


 http://localhost:8983/solr/#/**collection1/query?q=videohttp://localhost:8983/solr/#/collection1/query?q=video

When I want to query my fact table  I have to query using *video*.
just video retrieves nothing.
How can I query it using video only?
 3. In this 
 http://wiki.apache.org/solr/**ExtendedDisMax#Configurationhttp://wiki.apache.org/solr/ExtendedDisMax#Configuration
 **page

 it says that
 Extended DisMax is already configured in the example configuration, with
 the name edismax
 But I see it only in the /browse requestHandler
 as follows:


 requestHandler name=/browse class=solr.SearchHandler
 lst name=defaults
   str name=echoParamsexplicit/**str
...
!-- Query settings --
   str name=defTypeedismax/str

 Do I use it also when I use select in my url ?

 4. In general, I want to transfer a user generated text to my url request
 using the most standard rules (translate ,+,- signs to the q parameter
 value).
 What is the best way to



 Thanks.



How to define my data in schema.xml

2013-06-17 Thread Mysurf Mail
Hi,
I have created a flat table from my DB and defined a solr core on it.
It works excellent so far.

My problem is that my table has two hierarchies, so when flattened it is too
big.
Lets consider the following example scenario

My Tables are

School
Students (1:n with school)
Teachers(1:n with school)

Now, each school has many students and teachers, but each student/teacher
has another multivalued field, i.e. the following tables:

studentHobbies - 1:N with students
teacherCourses - 1:N with teachers

My main entity is School and that is what I want to get in the result.
Flattening does not help me much and is very expensive.

Can you direct me to how to define 1:n relationships (and 1:n:n)
in data-config.xml?
Thanks.
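
(A rough sketch of nested DIH entities for this scenario - table and column
names are guesses, not from the thread:

  <document>
    <entity name="school" query="SELECT ID, Name, Description FROM School">
      <entity name="student"
              query="SELECT ID, Name, Description FROM Students WHERE SchoolID='${school.ID}'">
        <entity name="studentHobby"
                query="SELECT Name FROM StudentHobbies WHERE StudentID='${student.ID}'"/>
      </entity>
      <entity name="teacher"
              query="SELECT ID, Name, Description FROM Teachers WHERE SchoolID='${school.ID}'">
        <entity name="teacherCourse"
                query="SELECT Name FROM TeacherCourses WHERE TeacherID='${teacher.ID}'"/>
      </entity>
    </entity>
  </document>

Each nested entity contributes multivalued fields to the parent school
document.)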


Is there a way to encrypt username and pass in the solr config file

2013-06-17 Thread Mysurf Mail
Hi,
I want to encrypt (RSA maybe?) my user name/pass in Solr.
I can't leave simple plain text on the server.
What is the recommended way?
Thanks.


Solr data files

2013-06-17 Thread Mysurf Mail
Where are the core data files located?
Can I just delete folder/files in order to quick clean the core/indexes?
Thanks


Re: How to define my data in schema.xml

2013-06-17 Thread Mysurf Mail
Thanks for your quick reply. Here are some notes:

1. Consider that all tables in my example have two columns, Name and
Description, which I would like to index and search.
2. I have no reason to create a flat table other than for Solr, so I
would like to see if I can avoid it.
3. If in my example I have a flat table then obviously it will hold a
lot of rows for a single school.
By searching the exact school name I will likely receive a lot of rows.
(My flat table has its own PK.)
That is something I would like to avoid, and I thought I could avoid it
by defining teachers and students as multivalued or something like that,
and then teacherCourses and studentHobbies as 1:n respectively.
This is quite similar to my real-life demand, so I came here to get
some tips as a Solr noob.


On Mon, Jun 17, 2013 at 9:08 PM, Gora Mohanty g...@mimirtech.com wrote:

 On 17 June 2013 21:39, Mysurf Mail stammail...@gmail.com wrote:
  Hi,
  I have created a flat table from my DB and defined a solr core on it.
  It works excellent so far.
 
  My problem is that my table has two hierarchies. So when flatted it is
 too
  big.

 What do you mean by too big? Have you actually tried
 indexing the data into Solr, and does the performance
 not meet your needs, or are you guessing from the size
 of the tables?

  Lets consider the following example scenario
 
  My Tables are
 
  School
  Students (1:n with school)
  Teachers(1:n with school)
 [...]

 Um, all of this crucially depends on what your 'n' is.
 Plus, you need to describe your use case in much
 more detail. At the moment, you are asking us to
 guess at what you are trying to do, which is inefficient,
 and unlikely to solve your problem.

 Regards,
 Gora



Re: Estimating the required volume to

2013-06-03 Thread Mysurf Mail
Thanks for your answer.
Can you please elaborate on
"mssql text searching is pretty primitive compared to Solr"?
(A link or anything.)
Thanks.


On Sun, Jun 2, 2013 at 4:54 PM, Erick Erickson erickerick...@gmail.comwrote:

 1 Maybe, maybe not. mssql text searching is pretty primitive
 compared to Solr, just as Solr's db-like operations are
 primitive compared to mssql. They address different use-cases.

 So, you can store the docs in Solr and not touch your SQL db
 at all to return the docs. You can store just the IDs in Solr and
 retrieve your docs from the SQL store. You can store just
 enough data in Solr to display the results page and when the user
 tries to drill down you can go to your SQL database for assembling
 the full document. You can. It all depend on your use case, data
size, all that rot.

Very often, something like the DB is considered the system-of-record
and it's indexed to Solr (See DIH or SolrJ) periodically.

   There is no underlying connection between your SQL store and Solr.
   You control when data is fetched from SQL and put into Solr. You
control what the search experience is. etc.

 2 Not really :(.  See:

 http://searchhub.org/dev/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

 Best
 Erick

 On Sat, Jun 1, 2013 at 1:07 PM, Mysurf Mail stammail...@gmail.com wrote:
  Hi,
 
  I am just starting to learn about solr.
  I want to test it in my env working with ms sql server.
 
  I have followed the tutorial and imported some rows to the Solr.
  Now I have a few noob question regarding the benefits of implementing
 Solr
  on a sql environment.
 
  1. As I understand, When I send a query request over http, I receive a
  result with ID from the Solr system and than I query the full object row
  from the db.
  Is that right?
  Is there a comparison  next to ms sql full text search which retrieves
 the
  full object in the same select?
  Is there a comparison that relates to db/server cluster and multiple
  machines?
  2. Is there a technic that will assist me to estimate the volume size I
  will need for the indexed data (obviously, based on the indexed data
  properties) ?



Re: Estimating the required volume to

2013-06-03 Thread Mysurf Mail
Hi,
Thanks for your answer.
I want to refer to your message, because I am trying to choose the right
tool.


1. regarding stemming:
I am running in ms-sql

SELECT * FROM sys.dm_fts_parser ('FORMSOF(INFLECTIONAL,provide)', 1033,
0, 0)

and I receive

group_id  phrase_id  occurrence  special_term  display_term  expansion_type  source_term
1         0          1           Exact Match   provided      2               provide
1         0          1           Exact Match   provides      2               provide
1         0          1           Exact Match   providing     2               provide
1         0          1           Exact Match   provide       0               provide

Isn't that stemming?
2. Regarding synonyms:
SQL Server has a full thesaurus feature:
http://msdn.microsoft.com/en-us/library/ms142491.aspx
Doesn't that mean synonyms?


On Mon, Jun 3, 2013 at 2:43 PM, Erick Erickson erickerick...@gmail.comwrote:

 Here's a link to various transformations you can do
 while indexing and searching in Solr:
 http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
 Consider
  stemming
  ngrams
  WordDelimiterFilterFactory
  ASCIIFoldingFilterFactory
  phrase queries
  boosting
  synonyms
  blah blah blah

 You can't do a lot of these transformations, at least not easily
 in SQL. OTOH, you can't do 5-way joins in Solr. Different problems,
 different tools

 All that said, there's no good reason to use Solr if your use-case
 is satisfied by simple keyword searches that have no transformations,
 mysql etc. work just fine in those cases. It's all about selecting the
 right tool for the use-case.

 FWIW,
 Erick

 On Mon, Jun 3, 2013 at 4:44 AM, Mysurf Mail stammail...@gmail.com wrote:
  Thanks for your answer.
  Can you please elaborate on
  mssql text searching is pretty primitive compared to Solr
  (Link or anything)
  Thanks.
 
 
  On Sun, Jun 2, 2013 at 4:54 PM, Erick Erickson erickerick...@gmail.com
 wrote:
 
  1 Maybe, maybe not. mssql text searching is pretty primitive
  compared to Solr, just as Solr's db-like operations are
  primitive compared to mssql. They address different use-cases.
 
  So, you can store the docs in Solr and not touch your SQL db
  at all to return the docs. You can store just the IDs in Solr and
  retrieve your docs from the SQL store. You can store just
  enough data in Solr to display the results page and when the user
  tries to drill down you can go to your SQL database for assembling
  the full document. You can. It all depend on your use case, data
 size, all that rot.
 
 Very often, something like the DB is considered the system-of-record
 and it's indexed to Solr (See DIH or SolrJ) periodically.
 
There is no underlying connection between your SQL store and Solr.
You control when data is fetched from SQL and put into Solr. You
 control what the search experience is. etc.
 
  2 Not really :(.  See:
 
 
 http://searchhub.org/dev/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
 
  Best
  Erick
 
  On Sat, Jun 1, 2013 at 1:07 PM, Mysurf Mail stammail...@gmail.com
 wrote:
   Hi,
  
   I am just starting to learn about solr.
   I want to test it in my env working with ms sql server.
  
   I have followed the tutorial and imported some rows to the Solr.
   Now I have a few noob question regarding the benefits of implementing
  Solr
   on a sql environment.
  
   1. As I understand, When I send a query request over http, I receive a
   result with ID from the Solr system and than I query the full object
 row
   from the db.
   Is that right?
   Is there a comparison  next to ms sql full text search which retrieves
  the
   full object in the same select?
   Is there a comparison that relates to db/server cluster and multiple
   machines?
   2. Is there a technic that will assist me to estimate the volume size
 I
   will need for the indexed data (obviously, based on the indexed data
   properties) ?
 



Clearing a specific index / all indices

2013-06-02 Thread Mysurf Mail
I am running solr with two cores in solr.xml
One is product (import from db) and one is collection1 (from the tutorial)

Now in order to clear the index I run

http://localhost:8983/solr/update?stream.body=<delete><query>*:*</query></delete>

http://localhost:8983/solr/update?stream.body=<commit/>


only the collection1 core (of the tutorial) is cleared.

How can I clear a specific index?

How can I clear all indices?

Thanks.
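
(For reference, a sketch: addressing the core by name in the path clears that
core only, e.g.

  http://localhost:8983/solr/product/update?stream.body=<delete><query>*:*</query></delete>&commit=true

using the product core name from above; repeat per core to clear them all.)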


word stem

2013-06-02 Thread Mysurf Mail
Using solr over my sql db I query the following

http://localhost:8983/solr/products/select?q=require&wt=xml&indent=true&fl=*,score

where the queried word require is found in the index since I imported the
following:

Each frame is hand-crafted in our Bothell facility to the optimum diameter
and wall-thickness *required *of a premium mountain frame. The heat-treated
welded aluminum frame has a larger diameter tube that absorbs the bumps.

required!=require

I tried it in the analysis tool in the admin portal for debugging, and I see in the
field value that the PST (stem) filter does make a token from required as
requir.
I write required in the debug query field and when I click on Analyse
Values I see requir is highlighted.


But the HTTP query only returns results when I query required, not require.


Thanks.


Re: installing configuring solr over ms sql server - tutorial needed

2013-06-01 Thread Mysurf Mail
My problem was with SQL Server.
This is a great step-by-step guide:
http://danpincas.com/2013/03/03/searching-with-solr-part-1.html


On Sat, Jun 1, 2013 at 2:06 AM, bbarani bbar...@gmail.com wrote:

 Why dont you follow this one tutorial to set the SOLR on tomcat..

 http://wiki.apache.org/solr/SolrTomcat



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/installing-configuring-solr-over-ms-sql-server-tutorial-needed-tp4067344p4067488.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Estimating the required volume to

2013-06-01 Thread Mysurf Mail
Hi,

I am just starting to learn about solr.
I want to test it in my env working with ms sql server.

I have followed the tutorial and imported some rows to the Solr.
Now I have a few noob question regarding the benefits of implementing Solr
on a sql environment.

1. As I understand it, when I send a query request over HTTP, I receive a
result with IDs from the Solr system and then I query the full object rows
from the DB.
Is that right?
Is there a comparison with MS SQL full-text search, which retrieves the
full object in the same select?
Is there a comparison that relates to DB/server clusters and multiple
machines?
2. Is there a technique that will assist me in estimating the volume size I
will need for the indexed data (obviously, based on the indexed data's
properties)?


installing configuring solr over ms sql server - tutorial needed

2013-05-31 Thread Mysurf Mail
I am trying to configure Solr over MS SQL Server.
I found only this tutorial:
http://www.chrisumbel.com/article/lucene_solr_sql_server
which is a bit old (2011).
Is there an updated / official tutorial?


Re: installing configuring solr over ms sql server - tutorial needed

2013-05-31 Thread Mysurf Mail
Thanks.
A tutorial on getting Solr over MSSQL?
I didn't find one, even with Jetty.



On Fri, May 31, 2013 at 6:21 PM, Alexandre Rafalovitch
arafa...@gmail.comwrote:

 You have two mostly-separate issues here. Running Solr in Tomcat and
 indexing MSSql server.

 Try just running a default embedded-Jetty example until you get data
 import sorted out. Then, you can worry about Tomcat. And it would be
 easier to help with one problem at a time.

 Regards,
Alex.
 Personal blog: http://blog.outerthoughts.com/
 LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
 - Time is the quality of nature that keeps events from happening all
 at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
 book)


 On Fri, May 31, 2013 at 11:03 AM, Mysurf Mail stammail...@gmail.com
 wrote:
  I am trying to config solr over ms sql server.
  I found only this tutorial
  http://www.chrisumbel.com/article/lucene_solr_sql_serverwhih
  is a bit old (2011)
  Is there an updated  / formal tutorial?



Re: installing configuring solr over ms sql server - tutorial needed

2013-05-31 Thread Mysurf Mail
For instance step 5 - Download and install a SQL Server JDBC driver.
Where do I put it when using Jetty?

* I just asked here whether an official tutorial for MS SQL Server
exists before I try to go through several tutorials.



On Fri, May 31, 2013 at 6:42 PM, Alexandre Rafalovitch
arafa...@gmail.comwrote:

 What's wrong with the one you found. Just ignore steps 1-4 and go
 right into driver and DIH setup. If you hit any problems, you now have
 a specific question to ask.

 Regards,
Alex.
 Personal blog: http://blog.outerthoughts.com/
 LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
 - Time is the quality of nature that keeps events from happening all
 at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
 book)


 On Fri, May 31, 2013 at 11:29 AM, Mysurf Mail stammail...@gmail.com
 wrote:
  Thanks.
  A tutorial on getting solr over mssql ?
  I didnt find it even with jetty
 
 
 
  On Fri, May 31, 2013 at 6:21 PM, Alexandre Rafalovitch
  arafa...@gmail.comwrote:
 
  You have two mostly-separate issues here. Running Solr in Tomcat and
  indexing MSSql server.
 
  Try just running a default embedded-Jetty example until you get data
  import sorted out. Then, you can worry about Tomcat. And it would be
  easier to help with one problem at a time.
 
  Regards,
 Alex.
  Personal blog: http://blog.outerthoughts.com/
  LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
  - Time is the quality of nature that keeps events from happening all
  at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
  book)
 
 
  On Fri, May 31, 2013 at 11:03 AM, Mysurf Mail stammail...@gmail.com
  wrote:
   I am trying to config solr over ms sql server.
   I found only this tutorial
   http://www.chrisumbel.com/article/lucene_solr_sql_serverwhih
   is a bit old (2011)
   Is there an updated  / formal tutorial?
 



Re: installing configuring solr over ms sql server - tutorial needed

2013-05-31 Thread Mysurf Mail
BTW: the other steps still refer to locations relative to Tomcat.


On Sat, Jun 1, 2013 at 12:02 AM, Mysurf Mail stammail...@gmail.com wrote:

 for instance step 5 - Download and install a SQL Server JDBC drive.
 Where do I put it when using jetty?

 * Just asked here a question if an official  tutorial for ms sql server
 exists before I try to go through several tutorials.



 On Fri, May 31, 2013 at 6:42 PM, Alexandre Rafalovitch arafa...@gmail.com
  wrote:

 What's wrong with the one you found. Just ignore steps 1-4 and go
 right into driver and DIH setup. If you hit any problems, you now have
 a specific question to ask.

 Regards,
Alex.
 Personal blog: http://blog.outerthoughts.com/
 LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
 - Time is the quality of nature that keeps events from happening all
 at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
 book)


 On Fri, May 31, 2013 at 11:29 AM, Mysurf Mail stammail...@gmail.com
 wrote:
  Thanks.
  A tutorial on getting solr over mssql ?
  I didnt find it even with jetty
 
 
 
  On Fri, May 31, 2013 at 6:21 PM, Alexandre Rafalovitch
  arafa...@gmail.comwrote:
 
  You have two mostly-separate issues here. Running Solr in Tomcat and
  indexing MSSql server.
 
  Try just running a default embedded-Jetty example until you get data
  import sorted out. Then, you can worry about Tomcat. And it would be
  easier to help with one problem at a time.
 
  Regards,
 Alex.
  Personal blog: http://blog.outerthoughts.com/
  LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
  - Time is the quality of nature that keeps events from happening all
  at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
  book)
 
 
  On Fri, May 31, 2013 at 11:03 AM, Mysurf Mail stammail...@gmail.com
  wrote:
   I am trying to config solr over ms sql server.
   I found only this tutorial
   http://www.chrisumbel.com/article/lucene_solr_sql_serverwhih
   is a bit old (2011)
   Is there an updated  / formal tutorial?
 





Re: installing configuring solr over ms sql server - tutorial needed

2013-05-31 Thread Mysurf Mail
Hi,
I am still having a problem with this tutorial -
http://www.chrisumbel.com/article/lucene_solr_sql_server - trying
to get Solr on Tomcat.
In step 4, when I copy apache-solr-1.4.0\example\solr to my Tomcat dir I get
a folder with a bin and a collection1 folder.
Do I need them?
Should I create conf under solr or under collection1?
I don't have any solrconfig or schema files under solr, only under
collection1.



On Sat, Jun 1, 2013 at 12:26 AM, bbarani bbar...@gmail.com wrote:

 solrconfig.xml - the lib directives specified in the configuration file are
 the lib locations where Solr would look for the jars.

 solr.xml - In case of the Multi core setup, you can have a sharedLib for
 all
 the collections. You can add the jdbc driver into the sharedLib folder.
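
 For example (a sketch - the dir path and jar name depend on where you put
 the driver):

   <!-- in solrconfig.xml -->
   <lib dir="../../lib" regex="sqljdbc.*\.jar"/>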



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/installing-configuring-solr-over-ms-sql-server-tutorial-needed-tp4067344p4067465.html
 Sent from the Solr - User mailing list archive at Nabble.com.