adding XML data to SOLR index using DIH (xml-data-config)
We regularly create a Solr index from XML files, using the DIH with a suitably edited xml-data-config.xml. However, whenever new XML files become available, it seems like we have to rebuild the entire index using the Data Import Handler. Are we missing something? Should it be possible to add new XML to the index using /dataimport with delta-import selected? Many thanks to anyone who has been able to add new XML files to the Solr index without reindexing everything.

Paul
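For reference, the DIH commands involved look roughly like the sketch below (the core name "collection1" is a placeholder, and whether delta-import does anything depends entirely on the entity configuration in xml-data-config.xml):

```shell
# A full-import with clean=false adds/updates documents without
# wiping the existing index, which is often the practical answer
# for picking up new XML files.
curl "http://localhost:8983/solr/collection1/dataimport?command=full-import&clean=false&commit=true"

# delta-import only works if the entity defines delta logic; for
# file-based imports, FileListEntityProcessor supports
# newerThan="${dataimporter.last_index_time}" to select new files.
curl "http://localhost:8983/solr/collection1/dataimport?command=delta-import&commit=true"
```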
Re: mapreduce job using soirj 5
Check mapreduce.task.classpath.user.precedence and its equivalent property in different Hadoop versions. HADOOP_OPTS needs to work with this property set to true. I ran into a problem like yours, and playing with these parameters solved it.

On Wed, Jun 17, 2015 at 12:28 AM, adfel70 adfe...@gmail.com wrote:
We cannot downgrade httpclient in solrj5 because it's using new features, and we don't want to start altering Solr code. Anyway, we thought about upgrading httpclient in Hadoop, but as Erick said, it sounds like more work than just putting the jar on the data nodes. About that flag: we tried it, and Hadoop even has an environment variable HADOOP_USER_CLASSPATH_FIRST, but all our tests with that flag failed. We think this is an issue that Solr users are more likely to encounter than Cloudera users, so we would be glad for a more elegant solution or workaround than replacing the httpclient jar on the data nodes. Thank you all for your responses.

--
View this message in context: http://lucene.472066.n3.nabble.com/mapreduce-job-using-soirj-5-tp4212199p4212350.html
Sent from the Solr - User mailing list archive at Nabble.com.

--
Regards,
Shenghua (Daniel) Wan
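As a sketch of the suggestion above (jar name, driver class, and paths are placeholders; mapreduce.task.classpath.user.precedence is the property named in this thread and may be distro-specific, while mapreduce.job.user.classpath.first is the standard Hadoop 2 equivalent):

```shell
# Ask MapReduce to put the job's own jars (e.g. solrj 5's newer
# httpclient) ahead of Hadoop's bundled ones.
export HADOOP_USER_CLASSPATH_FIRST=true   # affects the client-side JVM

# -D flags assume the driver uses ToolRunner/GenericOptionsParser.
hadoop jar my-indexer-job.jar com.example.IndexDriver \
  -Dmapreduce.job.user.classpath.first=true \
  -Dmapreduce.task.classpath.user.precedence=true \
  /input /output
```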
Matching Queries with Wildcards and Numbers
Hi! I am a Solr user having an issue with matches on searches using the wildcard operators, specifically when the searches include a wildcard operator with a number. Here is an example: my query will look like (productTitle:*Sidem2*) and match nothing, when it should be matching the productTitle Sidem2. However, searching for Sidem will match the productTitle Sidem2. In addition, I have isolated it to fail to match only when the productTitle has a number in it; for example, a query for (productTitle:*Cupx Collapsed*) will correctly match the product Cupx Collapsed.

I need to use the wildcard operators around the query so that an auto-complete feature can be used, where if a user stops typing at a certain point, a search will be executed on their input so far and it will match the correct product titles.

I have looked all over, through the excellent book Solr In Action by Grainger and Potter, through Stack Overflow and several blog posts, and have not found anything on this specific issue. Common advice is to remove the stemmer, which I have done. I have also added the ReversedWildcardFilterFactory.

Here is a copy of my schema for the specific fieldType if that is any help. Please let me know if anyone has any tips or clues! I am not a very experienced Solr user and would really appreciate any advice.

<fieldType name="text_en_splitting" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <!-- Case insensitive stop word removal. -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
    <!-- Concatenate characters and numbers by setting catenateAll to 1 - this will avoid problems with alphabetical sort -->
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true" maxPosAsterisk="2" maxPosQuestion="1" minTrailing="2" maxFractionAsterisk="0"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
    <!-- Concatenate characters and numbers by setting catenateAll to 1 - this will avoid problems with alphabetical sort -->
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1" preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
  </analyzer>
</fieldType>

Thank you in advance!

-- From a sincerely puzzled Solr user, Ellington Kirby
Re: Please help test the new Angular JS Admin UI
Also, while you are at it, it'd be good to get SOLR-4777 in so the Admin UI is correct when users look at the SolrCloud graph after an operation that can leave a slice INACTIVE, e.g. shard split.

On Wed, Jun 17, 2015 at 2:50 PM, Anshum Gupta ans...@anshumgupta.net wrote:
[earlier message and quoted announcement snipped]

--
Anshum Gupta
Re: Where is schema.xml ?
Do you have a managed-schema file, or such? You may have used the configs that have a managed schema, i.e. one that allows you to change the schema via HTTP.

Upayavira

On Wed, Jun 17, 2015, at 02:33 PM, TK Solr wrote:
With Solr 5.2.0, I ran: bin/solr create -c foo
This created solrconfig.xml in the server/solr/foo/conf directory. Other configuration files such as synonyms.txt are found in this directory too. But I don't see schema.xml. Why is schema.xml handled differently? I am guessing server/solr/configsets/sample_techproducts_configs/conf/schema.xml is used by the foo core because it knows about the cat field. Are the template files in sample_techproducts_configs considered standard?
TK
Re: Please help test the new Angular JS Admin UI
This looks good overall and thanks for migrating it to something that more developers can contribute to.

I started solr (trunk) in cloud mode using the bin scripts and opened the new admin UI. The section for 'cores' says 'No cores available. Go and create one'. Starting Solr 5.0, we officially stated in the change log and at other places that the only supported way to create a collection is through the Collections API. We should move along those lines and not stray with the new interface. I am not sure if the intention with this move is to first migrate everything as is and then redo the design, but I'd strongly suggest that we do things the right way.

On Sun, Jun 14, 2015 at 5:53 PM, Erick Erickson erickerick...@gmail.com wrote:

And anyone who, you know, really likes working with UI code please help making it better! As of Solr 5.2, there is a new version of the Admin UI available, and several improvements are already in 5.2.1 (release imminent). The old admin UI is still the default; the new one is available at solr_ip:port/admin/index.html. Currently, you will see very little difference at first glance; the goal for this release was to have as much of the current functionality as possible ported to establish the framework. Upayavira has done almost all of the work getting this in place, thanks for taking that initiative Upayavira!

Anyway, the plan is several-fold:
- Get as much testing on this as possible over the 5.2 time frame.
- Make the new Angular JS-based code the default in 5.3.
- Make improvements/bug fixes to the admin UI on the new code line, particularly SolrCloud functionality.
- Deprecate the current code and remove it eventually.

The new code should be quite a bit easier to work on for programmer types, and there are Big Plans Afoot for making the admin UI more SolrCloud-friendly. Now that the framework is in place, it should be easier for anyone who wants to volunteer to contribute, please do!

So please give it a whirl. I'm sure there will be things that crop up, and any help addressing them will be appreciated. There's already an umbrella JIRA for this work, see: https://issues.apache.org/jira/browse/SOLR-7666. Please link any new issues to this JIRA so we can keep track of it all as well as coordinate efforts. If all goes well, this JIRA can be used to see what's already been reported too. Note that things may be moving pretty quickly, so trunk and 5x will always be the most current. That said, looking at 5.2.1 will be much appreciated.

Erick

--
Anshum Gupta
Re: Please help test the new Angular JS Admin UI
Thanks Ramkumar, will dig into these next week.

Upayavira

On Wed, Jun 17, 2015, at 02:08 PM, Ramkumar R. Aiyengar wrote:
[bug report with stack trace snipped; see Ramkumar's original message]
Re: Where is schema.xml ?
On 6/17/15, 2:35 PM, Upayavira wrote:
Do you have a managed-schema file, or such? You may have used the configs that have a managed schema, i.e. one that allows you to change the schema via HTTP.

I do see a file named managed-schema, without the .xml extension, in the conf directory. Its content does look like a schema.xml file. Is this the initial content of the in-memory schema, which the Schema API then updates dynamically?
Re: Please help test the new Angular JS Admin UI
I started with an empty Solr instance and Firefox 38 on Linux. This is the trunk source.

There's a 'No cores available. Go and create one' button available in the old and the new UI. In the old UI, clicking it goes to the core admin, and pops open the dialog for Add Core. The new UI only goes to the core admin. Also, when you then click on Add Core, the dialog bleeds into the sidebar.

I then started with a getting-started config and a cloud of 2x2. Then brought up the admin UI on one of them, opened up one of the cores, and clicked on the Files tab -- that showed an exception:

{"data":{"responseHeader":{"status":500,"QTime":1},"error":{"msg":"Path must not end with / character","trace":"
java.lang.IllegalArgumentException: Path must not end with / character
	at org.apache.zookeeper.common.PathUtils.validatePath(PathUtils.java:58)
	at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1024)
	at org.apache.solr.common.cloud.SolrZkClient$5.execute(SolrZkClient.java:319)
	at org.apache.solr.common.cloud.SolrZkClient$5.execute(SolrZkClient.java:316)
	at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:61)
	at org.apache.solr.common.cloud.SolrZkClient.exists(SolrZkClient.java:316)
	at org.apache.solr.handler.admin.ShowFileRequestHandler.getAdminFileFromZooKeeper(ShowFileRequestHandler.java:324)
	at org.apache.solr.handler.admin.ShowFileRequestHandler.showFromZooKeeper(ShowFileRequestHandler.java:148)
	at org.apache.solr.handler.admin.ShowFileRequestHandler.handleRequestBody(ShowFileRequestHandler.java:135)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:2057)
	at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:648)
	at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:452)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:227)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:196)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
	at

Moving to Plugins/Stats, and then Core, and selecting the first searcher entry (e.g. for me, it is Searcher@3a7bd1[gettingstarted_shard1_replica1] main), I see stats like:
- searcherName: Searcher@&#8203;3a7bd1[gettingstarted_shard1_replica1] main
- reader: ExitableDirectoryReader(&#8203;UninvertingDirectoryReader(&#8203;))

Notice the unescaped characters there..
Where is schema.xml ?
With Solr 5.2.0, I ran: bin/solr create -c foo

This created solrconfig.xml in the server/solr/foo/conf directory. Other configuration files such as synonyms.txt are found in this directory too. But I don't see schema.xml. Why is schema.xml handled differently? I am guessing server/solr/configsets/sample_techproducts_configs/conf/schema.xml is used by the foo core because it knows about the cat field. Are the template files in sample_techproducts_configs considered standard?

TK
RE: Please help test the new Angular JS Admin UI
I will check with Henry about this problem again.

Best,
Soonho

From: Ramkumar R. Aiyengar [andyetitmo...@gmail.com]
Sent: Wednesday, June 17, 2015 5:08 PM
To: solr-user@lucene.apache.org
Subject: Re: Please help test the new Angular JS Admin UI

[bug report with stack trace snipped; see Ramkumar's original message]
About indexing embed file with solr
Hello, has anyone received my email? I'm new to Solr and I have some questions; could anyone help me with some answers?

I index files directly by extracting their content using the Tika embedded in Solr. There is no problem with normal files. But when I index a Word document with another file embedded in it, such as a PDF embedded in a Word (doc) file, I can't get the content of the embedded file. For example, I have a Word (doc) file with a PDF embedded in it, and I can't index the content of that PDF. When I use the same Tika jar on its own to extract the content of the embedded file, I can get it. I know Tika has been able to extract embedded files since version 1.3, and my Solr version is 4.9.1, where the bundled Tika is 1.5. I don't know why I can't get the content of the embedded file. Could anyone help me? Thank you very much.

Ping Liu
18 June 2015
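One way to see what Solr's built-in Tika actually returns for the container document is the extractOnly flag of the ExtractingRequestHandler; a diagnostic sketch (the core name "collection1" and the file name are placeholders):

```shell
# Return the extracted content instead of indexing it, to check
# whether the embedded PDF's text shows up in what Solr's Tika
# produces for the Word container.
curl "http://localhost:8983/solr/collection1/update/extract?extractOnly=true&wt=json&indent=true" \
     -F "myfile=@report.doc"
```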
Re: Matching Queries with Wildcards and Numbers
This one's going to be confusing to explain. The ability of filters to operate on wildcarded terms at query time is limited to some specific filters; if you're going into the code, see the MultiTermAware-derived filters. Generally, only filters that do _not_ produce more than one output token for a given input token can be MultiTermAware. Gibberish, I know, but bear with me.

WordDelimiterFilterFactory is _NOT_ MultiTermAware because, you guessed it, it can produce more than one token per input token. Specifically, in your example, at index time it'll produce the tokens Sidem and 2. However, at query time Sidem2 just passes through whole, and since that token is not in your index, it's not found. Hmm, I wonder what the admin/analysis page would show here...

Anyway, you can probably get what you want by changing the index-time definition of WordDelimiterFilterFactory from catenateAll="0" to catenateAll="1". That will put Sidem, 2, and Sidem2 in your index. Then, since query-time processing for wildcards does _not_ break things up, Sidem2 will go through whole at query time and the doc should be found. Of course, you have to reindex your docs after the change.

Trying to allow wildcards at query time for filters that emit multiple output tokens per input token is an utter and complete disaster.

HTH,
Erick

On Wed, Jun 17, 2015 at 10:56 AM, Ellington Kirby ellingtonkirb...@gmail.com wrote:
Hi! I am a Solr user having an issue with matches on searches using the wildcard operators, specifically when the searches include a wildcard operator with a number. Here is an example. My query will look like (productTitle:*Sidem2*) and match nothing, when it should be matching the productTitle Sidem2. However, searching for Sidem will match the productTitle Sidem2.
In addition, I have isolated it to only fail to match when the productTitle has a number in it, for example a query for (productTitle:*Cupx Collapsed*) will correctly match the product Cupx Collapsed. I need to use the wildcard operators around the query so that an auto-complete feature can be used, where if a user stops typing at a certain point, a search will be executed on their input so far and it will match the correct product titles. I have looked all over, through the excellent book Solr In Action by Grainger and Potter, through Stack Overflow and several blog posts and have not found anything on this specific issue. Common advice is to remove the stemmer, which I have done. I have also added the ReversedWildcardFilterFactory. Here is a copy of my schema for the specific fieldType if that is any help.

[fieldType definition snipped; see the original message]

Thank you in advance! -- From a sincerely puzzled Solr user, Ellington Kirby
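For reference, Erick's suggested change amounts to flipping one attribute on the index-time WordDelimiterFilterFactory (a sketch of just that filter line; the rest of the fieldType stays as posted, and a full reindex is required afterwards):

```xml
<!-- index-time analyzer: catenateAll="1" keeps the joined token
     ("Sidem2") alongside the split parts ("Sidem", "2"), so an
     unanalyzed wildcard query like *Sidem2* can match. -->
<filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="1" generateNumberParts="1"
        catenateWords="1" catenateNumbers="1"
        catenateAll="1" splitOnCaseChange="1"/>
```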
API support upload file for External File Field
Is there any API for uploading a file for an ExternalFileField to the /data/ directory, or any good practice for this? My application and Solr server are on two physically separate machines. The application calculates a score and generates a file for the ExternalFileField. Thanks for any input.
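As far as I know there is no dedicated upload API for this; a common practice is to copy the file over and then trigger a new searcher so it gets reloaded. A sketch under assumptions (host, data directory, core name, and field name "rank" are all placeholders; reloading on commit relies on an ExternalFileFieldReloader newSearcher listener being configured in solrconfig.xml):

```shell
# Copy the generated score file to Solr's data dir for the core;
# external file field data is read from a file named
# external_<fieldname> in the index data directory.
scp external_rank solr-host:/var/solr/data/mycore/data/external_rank

# Trigger a commit so a new searcher opens and (with the reloader
# listener configured) picks up the new scores.
curl "http://solr-host:8983/solr/mycore/update?commit=true"
```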
Re: Please help test the new Angular JS Admin UI
The intention very much is to do a Collections API pane. In fact, I've got a first pass made already that can create/delete collections, and show the details of a collection and its replicas. But I want to focus on getting the feature-for-feature replacement working first. If we don't do that, then we can't make it the default, and then we create a divided experience between those who want a working UI and those who want the cool new features. A decent Collections API tab really won't take that long, I don't think, once we've given the new version a good shake-down.

Upayavira

On Wed, Jun 17, 2015, at 02:50 PM, Anshum Gupta wrote:
[earlier message and quoted announcement snipped]
Re: Where is schema.xml ?
On Wed, Jun 17, 2015, at 02:49 PM, TK Solr wrote:
I do see a file named managed-schema without .xml extension in the conf directory. Its content does look like a schema.xml file. Is this an initial content of in-memory schema, and schema API updates the schema dynamically?

Yup, that's how I understand it. You should not edit that file directly.

Upayavira
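With a managed schema, changes go through the Schema API rather than through the file; a sketch using the "foo" core from this thread (the field name is a placeholder):

```shell
# Add a field to the managed schema over HTTP; Solr rewrites the
# managed-schema file itself, so no hand-editing is needed.
curl -X POST -H 'Content-type:application/json' \
  "http://localhost:8983/solr/foo/schema" --data-binary '{
    "add-field": {
      "name": "mynewfield",
      "type": "string",
      "stored": true
    }
  }'
```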
Re: Please help test the new Angular JS Admin UI
This kind of feedback is _very_ valuable, many thanks to all. I may be the one committing this, but Upayavira is doing all the work, so hats off to him. And it's time for anyone who likes UI work to step up and contribute ;). I'll be happy to commit changes. Just link any JIRAs (especially ones with patches attached) to SOLR-7666 and I'll see them. Or mention me in the new JIRA and I'll link them. Needless to say, UI work isn't something I'm very good at...

On Wed, Jun 17, 2015 at 5:55 PM, Upayavira u...@odoko.co.uk wrote:
We can get things like this in. If you want, feel free to have a go. As much as I want to work on funky new stuff, I really need to focus on finishing stuff first.
Upayavira

On Wed, Jun 17, 2015, at 02:53 PM, Anshum Gupta wrote:
[earlier messages and quoted announcement snipped]
XPathEntityProcessor on a CLOB field
My requirement is to read the XML from a CLOB field and parse it to get the entity. The data config is shown below. I am trying to map two fields, 'event' and 'policyNumber', for the entity 'catReport'.

dataSource name=mbdev driver=oracle.jdbc.driver.OracleDriver url=jdbc:oracle:thin:@localhost:1521:orcl user=xyz password=xyz/
document name=insight
  entity name=input query=select * from test logLevel=debug datasource=mbdev transformer=ClobTransformer, script:toDate
    field column=LOAD_DATE name=load_date /
    field column=RESPONSE_XML name=RESPONSE_XML clob=true /
  dataSource name=xmldata type=FieldReaderDataSource/
  entity name=catReport dataSource=xmldata dataField=input.RESPONSE_XML processor=XPathEntityProcessor forEach=/*:DecisionServiceRs rootEntity=true logLevel=debug
    field column=event xpath=/dec:DecisionServiceRs/@event/

I am getting this error:

Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to execute query: null Processing Document # 1
  at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:70)
  at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.init(JdbcDataSource.java:321)
  at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:278)
  at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:53)
  at org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:283)
  at org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:224)

I can see that the CLOB is being converted to a String correctly, and the log has this entry where the XML is printed: Exception while processing: input document : SolrInputDocument(fields: [RESPONSE_XML=dec:Deci. I do not know why the error is thrown at the JDBC layer when the CLOB is converted to a String and passed to the FieldReader, and I do not know how to make this work. Thanks Pattabi
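Conceptually, the config above reads RESPONSE_XML from the JDBC row, hands the string to the FieldReaderDataSource, and XPathEntityProcessor then extracts fields from it. A minimal Python sketch of that second stage (this is not Solr code, and the payload below is a hypothetical example invented for illustration):

```python
import xml.etree.ElementTree as ET

# Hypothetical CLOB content, as the ClobTransformer would hand it over
# after converting the Oracle CLOB to a plain string.
response_xml = (
    '<dec:DecisionServiceRs xmlns:dec="http://example.com/dec" event="approve">'
    '<dec:policyNumber>P-1001</dec:policyNumber>'
    '</dec:DecisionServiceRs>'
)

def extract_fields(xml_text):
    """Mimic what XPathEntityProcessor does for this config:
    grab the @event attribute off the root element."""
    root = ET.fromstring(xml_text)
    # forEach=/*:DecisionServiceRs matches the root regardless of prefix;
    # the event attribute itself is unprefixed in this sample payload.
    return {"event": root.attrib.get("event")}

print(extract_fields(response_xml))  # {'event': 'approve'}
```

If this stage works in isolation, the failure is in the JDBC stage feeding it, which matches the stack trace above.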
Re: Solr's suggester results
I'm using the FreeTextLookupFactory in my implementation now. Yes, now it can suggest part of the field from the middle of the content. I read that this implementation is able to consider the previous tokens when making the suggestions. However, when I enter a search phrase, it seems to consider only the last token and not any of the previous tokens. For example, when I search for http://localhost:8983/edm/collection1/suggest?suggest.q=trouble free, it gives me suggestions based on the word 'free' only, and not 'trouble free'. This is my configuration:

In solrconfig.xml:

searchComponent name=suggest class=solr.SuggestComponent
  lst name=suggester
    str name=lookupImplFreeTextLookupFactory/str
    str name=indexPathsuggester_freetext_dir/str
    str name=dictionaryImplDocumentDictionaryFactory/str
    str name=fieldSuggestion/str
    str name=suggestFreeTextAnalyzerFieldTypesuggestType/str
    str name=ngrams5/str
    str name=buildOnStartupfalse/str
    str name=buildOnCommitfalse/str
  /lst
/searchComponent

requestHandler name=/suggest class=solr.SearchHandler startup=lazy
  lst name=defaults
    str name=wtjson/str
    str name=indenttrue/str
    str name=suggesttrue/str
    str name=suggest.count10/str
    str name=suggest.dictionarymySuggester/str
  /lst
  arr name=components
    strsuggest/str
  /arr
/requestHandler

In schema.xml:

fieldType name=suggestType class=solr.TextField positionIncrementGap=100
  analyzer
    charFilter class=solr.PatternReplaceCharFilterFactory pattern=[^a-zA-Z0-9] replacement= /
    tokenizer class=solr.WhitespaceTokenizerFactory/
    filter class=solr.ShingleFilterFactory maxShingleSize=5 outputUnigrams=true/
    filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt /
  /analyzer
/fieldType

Is there anything I configured wrongly? I've set ngrams to 5, which means it should consider up to the previous 5 tokens entered?
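For reference, the ShingleFilterFactory in the analyzer above, with maxShingleSize=5 and outputUnigrams=true, emits the single token at each position plus word n-grams of two to five tokens. A toy Python sketch of that expansion (illustrative only, not Lucene's implementation):

```python
def shingles(tokens, max_size=5, output_unigrams=True):
    """Emit word n-grams (shingles) the way ShingleFilterFactory does:
    for each start position, the single token itself (when
    output_unigrams is true) plus joined runs of 2..max_size tokens."""
    out = []
    for i in range(len(tokens)):
        if output_unigrams:
            out.append(tokens[i])
        for n in range(2, max_size + 1):
            if i + n <= len(tokens):
                out.append(" ".join(tokens[i:i + n]))
    return out

print(shingles(["trouble", "free", "search"], max_size=5))
# ['trouble', 'trouble free', 'trouble free search',
#  'free', 'free search', 'search']
```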
Regards, Edwin On 17 June 2015 at 22:12, Alessandro Benedetti benedetti.ale...@gmail.com wrote: Edwin, The spellcheck is one thing, the Suggester is another. If you need to provide auto-suggestion to your users, the suggester is the right thing to use. But I really doubt it is useful to select the entire content as the suggester field; it is going to be quite expensive. In any case, I would again suggest you take a look at the article I quoted and the general Solr documentation. It is possible to suggest part of the field. You can use the FreeText suggester with a proper analysis selected. Cheers 2015-06-17 6:14 GMT+01:00 Zheng Lin Edwin Yeo edwinye...@gmail.com: Yes I've looked at that before, but I was told that the newer version of Solr has its own suggester, and does not need to use spellchecker anymore? So it's not necessary to use the spellchecker inside suggester anymore? Regards, Edwin On 17 June 2015 at 11:56, Erick Erickson erickerick...@gmail.com wrote: Have you looked at spellchecker? Because that sounds much more like what you're asking about than suggester. Spell checking is more what you're asking for; have you even looked at that after it was suggested? bq: Also, when I do a search, it shouldn't be returning whole fields, but just to return a portion of the sentence This is what highlighting is built for. Really, I recommend you take the time to do some familiarization with the whole search space and Solr. The excellent book here: http://www.amazon.com/Solr-Action-Trey-Grainger/dp/1617291021/ref=sr_1_1?ie=UTF8qid=1434513284sr=8-1keywords=apache+solrpebp=1434513287267perid=0YRK508J0HJ1N3BAX20E will give you the grounding you need to get the most out of Solr. Best, Erick On Tue, Jun 16, 2015 at 8:27 PM, Zheng Lin Edwin Yeo edwinye...@gmail.com wrote: The long content is from when I tried to index PDF files.
As some PDF files have a lot of words in the content, it leads to the *UTF8 encoding is longer than the max length 32766 error.* I think the problem is that the content size of the PDF file exceeds 32766 characters? I'm trying to be able to index documents of any size (even those with very large contents), and build the suggester from there. Also, when I do a search, it shouldn't be returning whole fields, but just a portion of the sentence. Regards, Edwin On 16 June 2015 at 23:02, Erick Erickson erickerick...@gmail.com wrote: The suggesters are built to return whole fields. You _might_ be able to add multiple fragments to a multiValued entry and get fragments; I haven't tried that though, and I suspect you'd actually get the same thing. This is an XY problem IMO. Please describe exactly what you're trying to accomplish, with examples, rather than continue to pursue this path. It sounds like you want spellcheck or
Re: adding XML data to SOLR index using DIH (xml-data-config)
There's no a-priori reason you should need to do this. What's your evidence here? What behaviors do you see when you try this? Details matter as Hoss would say. Give us an example of what changes in the XML file (and/or schema) you see that you think require re-indexing. Of course if you're adding new fields to schema.xml you need to reload (or restart) Solr. Best, Erick On Wed, Jun 17, 2015 at 12:06 PM, Morris, Paul E. pmor...@nsf.gov wrote: We regularly create a SOLR index from XML files, using the DIH with a suitably edited xml-data-config.xml. However, whenever new XML become available it seems like we have to rebuild the entire index again using the Data Import Handler. Are we missing something? Should it be possible to add new XML to the index using /dataimport with delta-import selected? Many thanks if anyone has been able to add new XML files to the SOLR index without reindexing everything again. Paul
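For the two /dataimport options mentioned in this thread: a delta-import only works if the entity defines delta queries, while a full-import with clean=false adds and updates documents without first deleting the existing index. A small Python sketch that just builds those request URLs (the host and core name are placeholders, not from the original message):

```python
from urllib.parse import urlencode

BASE = "http://localhost:8983/solr/mycore/dataimport"  # placeholder core

def dih_url(command, **params):
    """Build a Data Import Handler request URL for the given command."""
    query = {"command": command, **params}
    return BASE + "?" + urlencode(query)

# Re-run the import but keep existing documents (no clean step):
print(dih_url("full-import", clean="false", commit="true"))
# Delta import, if the entity defines delta queries:
print(dih_url("delta-import", commit="true"))
```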
Re: QueryParser to translate query arguments
On Wed, Jun 17, 2015 at 2:44 PM, Sreekant Sreedharan sreeka...@alamy.com wrote: I have a requirement to make SOLR a turnkey replacement for our legacy search engine. To do this, the queries supported by the legacy search engine have to be supported by SOLR, so I have implemented a QueryParser. I've implemented it several ways: 1. I've copied the implementation in LuceneQParser, which uses the SolrQueryParser, and essentially replaced the params of my QParser with an instance of the ModifiableSolrParams object, taking care to copy what exists in the previous params object and replacing the 'fq' argument that is mapped from the query argument supported by the legacy search engine. The problem with this approach is that ModifiableSolrParams does not allow you to have multiple fq arguments in it. http://lucene.apache.org/solr/4_2_1/solr-solrj/org/apache/solr/common/params/ModifiableSolrParams.html#add%28java.lang.String,%20java.lang.String...%29 But in some cases, we need to support multiple field restrictions. I would have preferred this solution because I imagine that leveraging SOLR's robust query parsing mechanism is easier than building a Lucene Query from scratch. 2. The second approach uses a BooleanQuery and attempts to construct the entire query from the query parameters. This approach seemed more promising, and works for most field restrictions. But I hit a roadblock. The filter seems to work for all string fields, but when I declare a field as an integer field in my schema.xml config file, the search does not return the very same documents. I am not sure why? Integers are encoded differently in the index from how we print them; it's done by calling the FieldType through the Analyzer in QueryBuilder.createFieldQuery(Analyzer, Occur, String, String, boolean, int), so it's a much longer journey. I was wondering what the best approach to this problem is (either 1 or 2 above, or something even better).
And I was wondering how to fix the problem in each of the above cases. -- View this message in context: http://lucene.472066.n3.nabble.com/QueryParser-to-translate-query-arguments-tp4212394.html Sent from the Solr - User mailing list archive at Nabble.com. -- Sincerely yours Mikhail Khludnev Principal Engineer, Grid Dynamics http://www.griddynamics.com mkhlud...@griddynamics.com
Re: How to create concatenated token
Dear Erick, e.g. Solr training *Porter:-* solr train Position 1 2 *Concatenated :-* solr train solrtrain Position 1 2 I did implement the filter as per my requirement. Thank you so much for your help and guidance. So how could I contribute it to Solr? With Regards Aman Tandon On Wed, Jun 17, 2015 at 10:14 AM, Aman Tandon amantandon...@gmail.com wrote: Hi Erick, Thank you so much, it will be helpful for me to learn how to save the state of a token. I had no idea how to save the state of previous tokens, so it was difficult to generate a concatenated token at the end. Is there anything I should read to learn more about it? With Regards Aman Tandon On Wed, Jun 17, 2015 at 9:20 AM, Erick Erickson erickerick...@gmail.com wrote: I really question the premise, but have a look at: https://issues.apache.org/jira/browse/SOLR-7193 Note that this is not committed and I haven't reviewed it, so I don't have anything to say about that. And you'd have to implement it as a custom Filter. Best, Erick On Tue, Jun 16, 2015 at 5:55 PM, Aman Tandon amantandon...@gmail.com wrote: Hi, Any guesses how I could achieve this behaviour? With Regards Aman Tandon On Tue, Jun 16, 2015 at 8:15 PM, Aman Tandon amantandon...@gmail.com wrote: e.g. Intent for solr training: fq=id: 234, 456, 545 title(solr training) typo error e.g. Intent for solr training: fq=id:(234 456 545) title:(solr training) With Regards Aman Tandon On Tue, Jun 16, 2015 at 8:13 PM, Aman Tandon amantandon...@gmail.com wrote: We have some business logic to search the user query against user intent, i.e. finding the exact matching products. e.g. Intent for solr training: fq=id: 234, 456, 545 title(solr training) As we can see it is a phrase query, so it will take more time than a single stemmed-token query. There are also 5-7 word phrase queries. So we want to reduce the search time by implementing this feature.
With Regards Aman Tandon On Tue, Jun 16, 2015 at 6:42 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: Can I ask you why you need to concatenate the tokens ? Maybe we can find a better solution to concat all the tokens in one single big token . I find it difficult to understand the reasons behind tokenising, token filtering and then un-tokenizing again :) It would be great if you explain a little bit better what you would like to do ! Cheers 2015-06-16 13:26 GMT+01:00 Aman Tandon amantandon...@gmail.com: Hi, I have a requirement to create the concatenated token of all the tokens created from the last item of my analyzer chain. *Suppose my analyzer chain is :* * tokenizer class=solr.WhitespaceTokenizerFactory / filter class=solr.WordDelimiterFilterFactory catenateAll=1 splitOnNumerics=1 preserveOriginal=1/filter class=solr.EdgeNGramFilterFactory minGramSize=2 maxGramSize=15 side=front /filter class=solr.PorterStemmerFilterFactory/* I want to create a concatenated token plugin to add at concatenated token along with the last token. e.g. Solr training *Porter:-* solr train Position 1 2 *Concatenated :-* solr train solrtrain Position 1 2 Please help me out. How to create custom filter for this requirement. With Regards Aman Tandon -- -- Benedetti Alessandro Visiting card : http://about.me/alessandro_benedetti Tyger, tyger burning bright In the forests of the night, What immortal hand or eye Could frame thy fearful symmetry? William Blake - Songs of Experience -1794 England
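The behaviour described in this thread — emit each analyzed token, then one extra token that concatenates the whole stream, stacked at the last position — can be sketched outside Lucene in a few lines of Python (a custom TokenFilter in Java would follow the same buffer-then-emit pattern discussed in SOLR-7193):

```python
def concat_filter(tokens):
    """Pass tokens through unchanged, then append one extra token that
    concatenates the whole stream, returned as (term, position) pairs.
    The concatenated token shares the position of the last token
    (a position increment of 0), matching the example in the thread:
    'solr train' -> solr@1, train@2, solrtrain@2."""
    out = [(tok, pos) for pos, tok in enumerate(tokens, start=1)]
    if tokens:
        out.append(("".join(tokens), len(tokens)))
    return out

print(concat_filter(["solr", "train"]))
# [('solr', 1), ('train', 2), ('solrtrain', 2)]
```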
SolrCloud Docker environment
Hi all, maybe someone could be interested in this. I have created a suite of Docker images, Dockerfiles and bash scripts useful to deploy a Zookeeper ensemble with 3 or more instances and a SolrCloud (v. 4 or 5) cluster. The SolrCloud 4 cluster is based on Tomcat 7. https://github.com/freedev/solrcloud-zookeeper-docker This could be interesting for applications that need a zookeeper/solrcloud cluster. Best regards, Vincenzo -- Vincenzo D'Amore email: v.dam...@gmail.com skype: free.dev mobile: +39 349 8513251
QueryParser to translate query arguments
I have a requirement to make SOLR a turnkey replacement for our legacy search engine. To do this, the queries supported by the legacy search engine have to be supported by SOLR, so I have implemented a QueryParser. I've implemented it several ways: 1. I've copied the implementation in LuceneQParser, which uses the SolrQueryParser, and essentially replaced the params of my QParser with an instance of the ModifiableSolrParams object, taking care to copy what exists in the previous params object and replacing the 'fq' argument that is mapped from the query argument supported by the legacy search engine. The problem with this approach is that ModifiableSolrParams does not allow you to have multiple fq arguments in it. But in some cases, we need to support multiple field restrictions. I would have preferred this solution because I imagine that leveraging SOLR's robust query parsing mechanism is easier than building a Lucene Query from scratch. 2. The second approach uses a BooleanQuery and attempts to construct the entire query from the query parameters. This approach seemed more promising, and works for most field restrictions. But I hit a roadblock. The filter seems to work for all string fields, but when I declare a field as an integer field in my schema.xml config file, the search does not return the very same documents. I am not sure why? I was wondering what the best approach to this problem is (either 1 or 2 above, or something even better). And I was wondering how to fix the problem in each of the above cases. -- View this message in context: http://lucene.472066.n3.nabble.com/QueryParser-to-translate-query-arguments-tp4212394.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: solr/lucene index merge and optimize performance improvement
On Tue, 2015-06-16 at 09:54 -0700, Shenghua(Daniel) Wan wrote: Hi, Toke, Did you try MapReduce with solr? I think it should be a good fit for your use case. Thanks for the suggestion. Improved logistics, such as starting build of a new shard while the previous shard is optimizing, would work for us. Switching to a new controlling layer is not trivial, so the win by better utilization during the optimization phase is not enough in itself to pay the cost. - Toke Eskildsen, State and University Library, Denmark
Re: Indexing search list of Key/Value pairs
Hi, Found the best way to do it (for those who read this in the future). Starting from Solr 4.8, nested documents can be used, so for each document we can create a child document with the key and value as fields for each key; using block join queries will then close the loop and give the ability to search for documents with a nested document matching the query. Hope this will help. Thanks, Ami -- View this message in context: http://lucene.472066.n3.nabble.com/Indexing-search-list-of-Key-Value-pairs-tp4156206p4212357.html Sent from the Solr - User mailing list archive at Nabble.com.
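As a concrete illustration of the approach Ami describes: the key/value pairs become child documents, and a block-join parent query finds parents with a matching child. The JSON child-document shape and the {!parent} query syntax are standard Solr; the field names and ids below are made up for the example:

```python
import json

# Parent document with one child per key/value pair; Solr indexes
# the parent and its children together as one block.
doc = {
    "id": "doc-1",
    "type_s": "parent",
    "_childDocuments_": [
        {"id": "doc-1-kv-1", "key_s": "color", "value_s": "red"},
        {"id": "doc-1-kv-2", "key_s": "size", "value_s": "XL"},
    ],
}
payload = json.dumps([doc])

# Block-join query: parents having a child where key=color AND value=red.
q = '{!parent which="type_s:parent"}(key_s:color AND value_s:red)'
print(q)
```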
Multivalued fields order of storing is guaranteed ?
Hello, I am using Solr 5.10 and I have a use case to fit in. Let's say I define 2 fields, group-name and group-id, both multivalued and stored. 1) Now I add the following values to each of them: group-name {a,b,c} and group-id {1,2,3}. 2) Now I want to add a new value to each of these 2 fields, {d} and {4}; my requirement is that it should add these new values such that when I query these 2 fields they return {a,b,c,d} and {1,2,3,4} in this order, i.e. a=1, d=4. Is it guaranteed that stored multivalued fields maintain the order of insertion, or do I need to explicitly handle this scenario? Any help is appreciated. Thanks, Alok -- View this message in context: http://lucene.472066.n3.nabble.com/Multivalued-fields-order-of-storing-is-guaranteed-tp4212383.html Sent from the Solr - User mailing list archive at Nabble.com.
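The use case relies on pairing two parallel multivalued fields by position, which a small sketch makes explicit (the pairing logic below is client-side illustration, not Solr behaviour — it simply assumes the stored values come back in insertion order, which is what the question asks Solr to guarantee):

```python
# Parallel multivalued fields, as returned by a stored-field fetch,
# assuming insertion order is preserved.
group_name = ["a", "b", "c", "d"]
group_id = [1, 2, 3, 4]

# The whole scheme depends on index i of one list corresponding
# to index i of the other:
pairs = dict(zip(group_name, group_id))
print(pairs)  # {'a': 1, 'b': 2, 'c': 3, 'd': 4}
```

If that ordering assumption is ever in doubt, encoding each pair in a single value (e.g. "a:1") or using child documents avoids the positional coupling entirely.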
Re: How to add a new child to existing document?
It doesn't work by design; you have to re-write the whole block. https://issues.apache.org/jira/browse/SOLR-6596 On Wed, Jun 17, 2015 at 11:44 AM, Maya G maiki...@gmail.com wrote: Hey, I'm trying to add a new child to an existing document. When I query for the child doc it doesn't return it. I'm using Solr 4.10.5. Thank you, Maya -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-add-a-new-child-to-existing-document-tp4212365.html Sent from the Solr - User mailing list archive at Nabble.com. -- Sincerely yours Mikhail Khludnev Principal Engineer, Grid Dynamics http://www.griddynamics.com mkhlud...@griddynamics.com
Dedupe in a SolrCloud
Hi, I am trying to use the dedupe feature to detect and mark near-duplicate content in my collections. I don't want to prevent duplicate content; I would like to detect it and keep it for further processing. That's why I'm using an extra field and not the document's unique field. Here is how I added it to solrconfig.xml:

requestHandler name=/update class=solr.UpdateRequestHandler
  lst name=defaults
    str name=update.chainfill_signature/str
  /lst
/requestHandler

updateRequestProcessorChain name=fill_signature processor=signature
  processor class=solr.RunUpdateProcessorFactory /
/updateRequestProcessorChain

updateProcessor class=solr.processor.SignatureUpdateProcessorFactory name=signature
  bool name=enabledtrue/bool
  str name=signatureFieldsignature/str
  bool name=overwriteDupesfalse/bool
  str name=fieldscontent/str
  str name=signatureClasssolr.processor.TextProfileSignature/str
  str name=quantRate.2/str
  str name=minTokenLen3/str
/updateProcessor

When I initially add the documents to the cloud everything works as expected: the documents are added and the signature is created and added. Perfect :) The problem occurs when I want to update an existing document. In that case the update.chain=fill_signature parameter will of course be set too, and I get a bad request error. I found this Solr issue: https://issues.apache.org/jira/browse/SOLR-3473 Is that the problem I am running into? Is it somehow possible to add parameters or set a specific update handler when I'm adding documents to the cloud using SolrJ? In that case I could either set the update.chain manually and remove it from the request handler, or write a second request handler which I only use when I want to set the signature field. I know I can do that manually when I'm using e.g. curl, but is it also possible with SolrJ? :) Thanks, Markus
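For intuition on the TextProfileSignature configured above: it builds a fuzzy fingerprint from the document's frequent tokens, so near-duplicates hash to the same value. A toy Python version of the same idea (this is not Solr's algorithm — quantRate handling is simplified to a plain count threshold — just the shape of it):

```python
import hashlib
import re

def toy_signature(text, min_token_len=3, min_count=2):
    """Fingerprint frequent tokens only: tokenize, drop short tokens,
    keep tokens appearing at least min_count times, sort, and hash.
    Rare words (typos, small edits) then don't change the signature."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = {}
    for t in tokens:
        if len(t) >= min_token_len:
            counts[t] = counts.get(t, 0) + 1
    profile = sorted(t for t, c in counts.items() if c >= min_count)
    return hashlib.md5(" ".join(profile).encode()).hexdigest()

a = toy_signature("the cat sat on the mat, the cat sat again")
b = toy_signature("the cat sat on the mat, the cat sat agian")  # typo
print(a == b)  # True: the rare (misspelled) word is below min_count
```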
ZooKeeper connection refused
Hi. I have a SolrCloud cluster with 3 nodes Solr + Zookeeper. My solr.in.sh file is configured as follows:

ZK_HOST=zk1,zk2,zk3

All worked well, but now I cannot start the SOLR nodes and the command exits with the following errors:

root@index1:~# service solr restart
Sending stop command to Solr running on port 8983 ... waiting 5 seconds to allow Jetty process 32087 to stop gracefully.
Waiting to see Solr listening on port 8983 [\] Still not seeing Solr listening on 8983 after 30 seconds!
WARN - 2015-06-17 10:18:37.158; [ ] org.apache.zookeeper.ClientCnxn$SendThread; Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
  at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
  at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
  at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
  at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
(the same WARN and ConnectException repeat at 10:18:37.823, 10:18:38.990, 10:18:40.543 and 10:18:42.174)

I can telnet to the ZooKeeper port:

root@index1:~# telnet zk1 2181
Trying 192.168.70.31...
Connected to index1.dc.my.network.
Escape character is '^]'.

Could you help me please? Thank you very much! Bye
How to add a new child to existing document?
Hey, I'm trying to add a new child to an existing document. When I query for the child doc it doesn't return it. I'm using Solr 4.10.5. Thank you, Maya -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-add-a-new-child-to-existing-document-tp4212365.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: suggester returning stems instead of whole words
ah looks like I need to use copyField to get a non stemmed version of the suggester field Alistair -- mov eax,1 mov ebx,0 int 80h On 17/06/2015 11:15, Alistair Young alistair.yo...@uhi.ac.uk wrote: I was wondering if there's a way to get the suggester to return whole words. Instead of returning 'technology' , 'temperature' and 'tutorial', it's returning 'technolog' , 'temperatur' and 'tutori' using this config: searchComponent class=solr.SpellCheckComponent name=suggest lst name=spellchecker str name=namesuggest/str str name=classnameorg.apache.solr.spelling.suggest.Suggester/str str name=lookupImplorg.apache.solr.spelling.suggest.fst.WFSTLookupFactory/ str str name=fielddc.subject/str float name=threshold0.005/float str name=buildOnCommittrue/str /lst /searchComponent requestHandler class=org.apache.solr.handler.component.SearchHandler name=/suggest lst name=defaults str name=spellchecktrue/str str name=spellcheck.dictionarysuggest/str str name=spellcheck.onlyMorePopulartrue/str str name=spellcheck.count10/str str name=spellcheck.collatetrue/str /lst arr name=components strsuggest/str /arr /requestHandler thanks, Alistair -- mov eax,1 mov ebx,0 int 80h
suggester returning stems instead of whole words
I was wondering if there's a way to get the suggester to return whole words. Instead of returning 'technology', 'temperature' and 'tutorial', it's returning 'technolog', 'temperatur' and 'tutori' using this config:

searchComponent class=solr.SpellCheckComponent name=suggest
  lst name=spellchecker
    str name=namesuggest/str
    str name=classnameorg.apache.solr.spelling.suggest.Suggester/str
    str name=lookupImplorg.apache.solr.spelling.suggest.fst.WFSTLookupFactory/str
    str name=fielddc.subject/str
    float name=threshold0.005/float
    str name=buildOnCommittrue/str
  /lst
/searchComponent

requestHandler class=org.apache.solr.handler.component.SearchHandler name=/suggest
  lst name=defaults
    str name=spellchecktrue/str
    str name=spellcheck.dictionarysuggest/str
    str name=spellcheck.onlyMorePopulartrue/str
    str name=spellcheck.count10/str
    str name=spellcheck.collatetrue/str
  /lst
  arr name=components
    strsuggest/str
  /arr
/requestHandler

thanks, Alistair -- mov eax,1 mov ebx,0 int 80h
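What is happening here is that the suggester dictionary is built from the analyzed (stemmed) terms of dc.subject, so the stored entries are stems, not whole words. A toy suffix-stripper makes the effect visible (this is not the Porter algorithm used by Solr's stemming filters, just an illustration of why stored terms end up truncated):

```python
def toy_stem(word, suffixes=("al", "e", "y")):
    """Strip the first matching suffix, crudely imitating what a
    stemming filter does to terms before they reach the index."""
    for s in suffixes:
        if word.endswith(s) and len(word) > len(s) + 2:
            return word[: -len(s)]
    return word

print([toy_stem(w) for w in ["technology", "temperature", "tutorial"]])
# ['technolog', 'temperatur', 'tutori']
```

Because the dictionary only ever sees the post-analysis terms, the fix is to build the suggester from a field whose analysis chain has no stemmer.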
Re: ZooKeeper connection refused
You are asking telnet to connect to zk1 on port 2181 but you have not specified the port to Solr. You should set ZK_HOST=zk1:2181,zk2:2181,zk3:2181 instead.

On Wed, Jun 17, 2015 at 3:53 PM, shacky shack...@gmail.com wrote:

Hi. I have a SolrCloud cluster with 3 nodes, Solr + ZooKeeper. My solr.in.sh file is configured as follows: ZK_HOST=zk1,zk2,zk3. Everything worked fine, but now I cannot start the Solr nodes and the command exits with the following errors:

root@index1:~# service solr restart
Sending stop command to Solr running on port 8983 ... waiting 5 seconds to allow Jetty process 32087 to stop gracefully.
Waiting to see Solr listening on port 8983 [\] Still not seeing Solr listening on 8983 after 30 seconds!
WARN - 2015-06-17 10:18:37.158; [ ] org.apache.zookeeper.ClientCnxn$SendThread; Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
(the same WARN and stack trace repeat at 10:18:37.823, 10:18:38.990, 10:18:40.543 and 10:18:42.174)

I can telnet to the ZooKeeper port:

root@index1:~# telnet zk1 2181
Trying 192.168.70.31...
Connected to index1.dc.my.network.
Escape character is '^]'.

Could you help me please? Thank you very much! Bye

-- Regards, Shalin Shekhar Mangar.
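For anyone landing on this thread: the fix is to include ZooKeeper's client port in ZK_HOST. A sketch of the relevant solr.in.sh lines, using the hostnames from this thread (the /solr chroot is an optional extra not used in the thread):

```shell
# solr.in.sh
ZK_HOST="zk1:2181,zk2:2181,zk3:2181"

# Optionally isolate this cluster's state under a chroot:
# ZK_HOST="zk1:2181,zk2:2181,zk3:2181/solr"
```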
Re: suggester returning stems instead of whole words
Did you change the SpellCheckComponent's configuration to use subject_autocomplete instead of dc.subject? After you made that change, did you invoke spellcheck.build=true to re-build the spellcheck index?

On Wed, Jun 17, 2015 at 7:06 PM, Alistair Young alistair.yo...@uhi.ac.uk wrote:

copyField doesn't seem to fix the suggestion stemming. Copying the field to another field of this type:

<field name="subject_autocomplete" type="text_auto" indexed="true" stored="true" multiValued="false"/>
<copyField source="dc.subject" dest="subject_autocomplete"/>
<fieldType class="solr.TextField" name="text_auto" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

but I'm still getting stemmed suggestions after rebuilding the index. Alistair

On 17/06/2015 11:28, Alistair Young alistair.yo...@uhi.ac.uk wrote: ah, looks like I need to use copyField to get a non-stemmed version of the suggester field. Alistair

On 17/06/2015 11:15, Alistair Young alistair.yo...@uhi.ac.uk wrote: I was wondering if there's a way to get the suggester to return whole words. Instead of returning 'technology', 'temperature' and 'tutorial', it's returning 'technolog', 'temperatur' and 'tutori' using this config:

<searchComponent class="solr.SpellCheckComponent" name="suggest">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.WFSTLookupFactory</str>
    <str name="field">dc.subject</str>
    <float name="threshold">0.005</float>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>
<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggest</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.count">10</str>
    <str name="spellcheck.collate">true</str>
  </lst>
  <arr name="components"><str>suggest</str></arr>
</requestHandler>

thanks, Alistair

-- mov eax,1 mov ebx,0 int 80h

-- Regards, Shalin Shekhar Mangar.
Re: suggester returning stems instead of whole words
copyField doesn't seem to fix the suggestion stemming. Copying the field to another field of this type:

<field name="subject_autocomplete" type="text_auto" indexed="true" stored="true" multiValued="false"/>
<copyField source="dc.subject" dest="subject_autocomplete"/>
<fieldType class="solr.TextField" name="text_auto" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

but I'm still getting stemmed suggestions after rebuilding the index. Alistair

On 17/06/2015 11:28, Alistair Young alistair.yo...@uhi.ac.uk wrote: ah, looks like I need to use copyField to get a non-stemmed version of the suggester field. Alistair

On 17/06/2015 11:15, Alistair Young alistair.yo...@uhi.ac.uk wrote: I was wondering if there's a way to get the suggester to return whole words. Instead of returning 'technology', 'temperature' and 'tutorial', it's returning 'technolog', 'temperatur' and 'tutori' using this config:

<searchComponent class="solr.SpellCheckComponent" name="suggest">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.WFSTLookupFactory</str>
    <str name="field">dc.subject</str>
    <float name="threshold">0.005</float>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>
<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggest</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.count">10</str>
    <str name="spellcheck.collate">true</str>
  </lst>
  <arr name="components"><str>suggest</str></arr>
</requestHandler>

thanks, Alistair

-- mov eax,1 mov ebx,0 int 80h
Re: How to create concatenated token
If you used the JIRA I linked, vote for it, add any improvements, etc. Anyone can attach a patch to a JIRA; you just have to create a login. That said, this may be too rare a use-case to deal with. I just thought of shingling, which I should have suggested before; that will work for concatenating small numbers of tokens, which I'd guess is the case here. I mean, do you really want to concatenate 50 tokens? Best, Erick

On Wed, Jun 17, 2015 at 12:07 AM, Aman Tandon amantandon...@gmail.com wrote: Dear Erick, e.g. "Solr training". Porter: solr train (positions 1, 2). Concatenated: solr train solrtrain (positions 1, 2). I did implement the filter as per my requirement. Thank you so much for your help and guidance. So how could I contribute it to Solr? With Regards, Aman Tandon

On Wed, Jun 17, 2015 at 10:14 AM, Aman Tandon amantandon...@gmail.com wrote: Hi Erick, thank you so much; it will be helpful for me to learn how to save the state of a token. I had no idea how to save the state of previous tokens, which made it difficult to generate a concatenated token at the end. So is there anything I should read to learn more about it? With Regards, Aman Tandon

On Wed, Jun 17, 2015 at 9:20 AM, Erick Erickson erickerick...@gmail.com wrote: I really question the premise, but have a look at: https://issues.apache.org/jira/browse/SOLR-7193 Note that this is not committed and I haven't reviewed it, so I don't have anything to say about that. And you'd have to implement it as a custom Filter. Best, Erick

On Tue, Jun 16, 2015 at 5:55 PM, Aman Tandon amantandon...@gmail.com wrote: Hi, any guesses how I could achieve this behaviour? With Regards, Aman Tandon

On Tue, Jun 16, 2015 at 8:15 PM, Aman Tandon amantandon...@gmail.com wrote: e.g. Intent for solr training: fq=id: 234, 456, 545 title(solr training) (typo error); e.g. Intent for solr training: fq=id:(234 456 545) title:(solr training) With Regards, Aman Tandon

On Tue, Jun 16, 2015 at 8:13 PM, Aman Tandon amantandon...@gmail.com wrote: We have some business logic to search the user query within the user intent, i.e. finding the exact matching products. e.g. Intent for solr training: fq=id: 234, 456, 545 title(solr training). As we can see it is a phrase query, so it will take more time than a single stemmed-token query. There are also 5-7 word phrase queries. So we want to reduce the search time by implementing this feature. With Regards, Aman Tandon

On Tue, Jun 16, 2015 at 6:42 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: Can I ask you why you need to concatenate the tokens? Maybe we can find a better solution than concatenating all the tokens into one single big token. I find it difficult to understand the reasons behind tokenising, token filtering and then un-tokenizing again :) It would be great if you could explain a little better what you would like to do! Cheers

2015-06-16 13:26 GMT+01:00 Aman Tandon amantandon...@gmail.com: Hi, I have a requirement to create a concatenated token of all the tokens produced at the end of my analyzer chain. Suppose my analyzer chain is:

<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.WordDelimiterFilterFactory" catenateAll="1" splitOnNumerics="1" preserveOriginal="1"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>
<filter class="solr.PorterStemmerFilterFactory"/>

I want to create a concatenated-token plugin to add a concatenated token along with the last token. e.g. "Solr training". Porter: solr train (positions 1, 2). Concatenated: solr train solrtrain (positions 1, 2). Please help me out. How do I create a custom filter for this requirement?
With Regards Aman Tandon -- -- Benedetti Alessandro Visiting card : http://about.me/alessandro_benedetti Tyger, tyger burning bright In the forests of the night, What immortal hand or eye Could frame thy fearful symmetry? William Blake - Songs of Experience -1794 England
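The behaviour Aman implemented (emit every analyzed token, then append one synthetic token that concatenates them all at the last position) can be illustrated outside Lucene. This is only a sketch of the intended token stream, not the custom TokenFilter itself; a real filter would buffer tokens with captureState/restoreState and emit the synthetic token with a position increment of 0:

```java
import java.util.ArrayList;
import java.util.List;

public class ConcatSketch {
    // Given the tokens an analyzer chain produced, append one synthetic
    // token concatenating them all ("solr", "train" -> extra "solrtrain").
    public static List<String> withConcatenated(List<String> tokens) {
        List<String> out = new ArrayList<>(tokens);
        if (tokens.size() > 1) {
            out.add(String.join("", tokens));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(withConcatenated(List.of("solr", "train")));
    }
}
```

In the real filter the synthetic token must share the last token's position (positionIncrement = 0) so phrase queries are not disturbed; SOLR-7193, linked above, sketches one proposed implementation.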
Re: suggester returning stems instead of whole words
yep, did both of those things. Getting the same results as using dc.subject.

On 17/06/2015 14:44, Shalin Shekhar Mangar shalinman...@gmail.com wrote: Did you change the SpellCheckComponent's configuration to use subject_autocomplete instead of dc.subject? After you made that change, did you invoke spellcheck.build=true to re-build the spellcheck index?

On Wed, Jun 17, 2015 at 7:06 PM, Alistair Young alistair.yo...@uhi.ac.uk wrote: copyField doesn't seem to fix the suggestion stemming. [...] but I'm still getting stemmed suggestions after rebuilding the index. [...]

-- Regards, Shalin Shekhar Mangar.
Re: Dedupe in a SolrCloud
Comments inline:

On Wed, Jun 17, 2015 at 3:18 PM, Markus.Mirsberger markus.mirsber...@gmx.de wrote:

Hi, I am trying to use the dedupe feature to detect and mark near-duplicate content in my collections. I don't want to prevent duplicate content; I would like to detect it and keep it for further processing. That's why I'm using an extra field and not the document's unique field. Here is how I added it to solrconfig.xml:

<requestHandler name="/update" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">fill_signature</str>
  </lst>
</requestHandler>

<updateRequestProcessorChain name="fill_signature" processor="signature">
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>

<updateProcessor class="solr.processor.SignatureUpdateProcessorFactory" name="signature">
  <bool name="enabled">true</bool>
  <str name="signatureField">signature</str>
  <bool name="overwriteDupes">false</bool>
  <str name="fields">content</str>
  <str name="signatureClass">solr.processor.TextProfileSignature</str>
  <str name="quantRate">.2</str>
  <str name="minTokenLen">3</str>
</updateProcessor>

When I initially add the documents to the cloud everything works as expected: the documents are added and the signature is created and added. Perfect :) The problem occurs when I want to update an existing document. In that case the update.chain=fill_signature parameter will of course be set too, and I get a bad request error. I found this Solr issue: https://issues.apache.org/jira/browse/SOLR-3473 Is that the problem I am running into?

You haven't pasted the complete error response so I am guessing a bit here. It is possible that you are running into the same problem, i.e. the signature is being calculated again and the signature field, not being multi-valued, causes an error.

Is it somehow possible to add parameters or set a specific update handler when I'm adding documents to the cloud using SolrJ?

Yes, any custom parameter can be added to a SolrJ request. There is a setParam(String param, String value) method available in AbstractUpdateRequest which can be used to set a custom update.chain for each SolrJ request.

In that case I could either set the update.chain manually and remove it from the request handler, or write a second request handler which I only use if I want to set the signature field. I know I can do that manually when I'm using e.g. curl, but is it also possible with SolrJ? :) Thanks, Markus

-- Regards, Shalin Shekhar Mangar.
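Putting Shalin's suggestion into code: an untested sketch against the SolrJ API (the `client` variable is an assumed, already-initialized SolrClient; the chain name comes from this thread's config; error handling omitted):

```java
// Sketch only: assumes a running SolrCloud and an initialized SolrClient "client".
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "doc-1");
doc.addField("content", "page text to fingerprint");

UpdateRequest req = new UpdateRequest();
req.add(doc);
// setParam(...) is inherited from AbstractUpdateRequest, as noted above
req.setParam("update.chain", "fill_signature");
req.process(client);
client.commit();
```

With this, update.chain can be removed from the /update handler defaults so that ordinary updates skip the signature chain.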
Re: Multivalued fields order of storing is guaranteed ?
Thanks Yonik. -- View this message in context: http://lucene.472066.n3.nabble.com/Multivalued-fields-order-of-storing-is-guaranteed-tp4212383p4212428.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Multivalued fields order of storing is guaranteed ?
On Wed, Jun 17, 2015 at 6:44 AM, Alok Bhandari alokomprakashbhand...@gmail.com wrote: Is it guaranteed that stored multivalued fields maintain order of insertion. Yes. -Yonik
Re: suggester returning stems instead of whole words
Hmmm, shouldn't be happening that way. Spellcheck is supposed to be looking at indexed terms. If you go into the admin/schema browser page and look at the new field, what are the terms in the index? They shouldn't be stemmed. And I always get confused where this <str name="spellcheck.dictionary">suggest</str> is supposed to point. Do you have any other component named suggest that you might be picking up? Best, Erick

On Wed, Jun 17, 2015 at 6:50 AM, Alistair Young alistair.yo...@uhi.ac.uk wrote: yep, did both of those things. Getting the same results as using dc.subject.

On 17/06/2015 14:44, Shalin Shekhar Mangar shalinman...@gmail.com wrote: Did you change the SpellCheckComponent's configuration to use subject_autocomplete instead of dc.subject? After you made that change, did you invoke spellcheck.build=true to re-build the spellcheck index? [...]
Re: ZooKeeper connection refused
Is ZK healthy? Can you try the following from the server on which Solr is running: echo ruok | nc zk1 2181 On Wed, Jun 17, 2015 at 7:25 PM, shacky shack...@gmail.com wrote: 2015-06-17 15:34 GMT+02:00 Shalin Shekhar Mangar shalinman...@gmail.com: You are asking telnet to connect to zk1 on port 2181 but you have not specified the port to Solr. You should set ZK_HOST=zk1:2181,zk2:2181,zk3:2181 instead. I modified the ZK_HOST instance with the port, but the problem is not solved. Do you have any ideas? -- Regards, Shalin Shekhar Mangar.
Re: suggester returning stems instead of whole words
looking at the schema browser, subject_autocomplete has a type of text_en rather than text_auto, and all the terms are stemmed. Its contents are the same as the field it was copied from, dc.subject, which is text_en and stemmed.

On 17/06/2015 14:58, Erick Erickson erickerick...@gmail.com wrote: Hmmm, shouldn't be happening that way. Spellcheck is supposed to be looking at indexed terms. If you go into the admin/schema browser page and look at the new field, what are the terms in the index? They shouldn't be stemmed. And I always get confused where this <str name="spellcheck.dictionary">suggest</str> is supposed to point. Do you have any other component named suggest that you might be picking up? Best, Erick [...]
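Given Alistair's finding that subject_autocomplete is resolving to text_en, the schema presumably never picked up the text_auto definition (a common cause is editing the schema without reloading the core and reindexing). A sketch of the intended unstemmed setup, reconstructed from the configs quoted in this thread:

```xml
<!-- No stemmer anywhere in this chain, so indexed terms stay whole words -->
<fieldType name="text_auto" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<field name="subject_autocomplete" type="text_auto" indexed="true" stored="true" multiValued="false"/>
<copyField source="dc.subject" dest="subject_autocomplete"/>
```

After the schema change, reload the core, reindex, and rebuild the dictionary with spellcheck.build=true so the suggester sees the unstemmed terms.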
ManagedStopFilterFactory not accepting ignoreCase
We're running Solr 4.10.4 and getting this:

Caused by: java.lang.IllegalArgumentException: Unknown parameters: {ignoreCase=true}
    at org.apache.solr.rest.schema.analysis.BaseManagedTokenFilterFactory.init(BaseManagedTokenFilterFactory.java:46)
    at org.apache.solr.rest.schema.analysis.ManagedStopFilterFactory.init(ManagedStopFilterFactory.java:47)

This is the filter definition I used:

<filter class="solr.ManagedStopFilterFactory" ignoreCase="true" managed="english"/>

Any ideas? Thanks, Mike
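The IllegalArgumentException above is the managed factory rejecting analyzer-level arguments. With managed resources, ignoreCase is an initArg of the managed stopword set, configured over the REST API rather than on the <filter/> element. A sketch (URL, core name and resource handle are illustrative; check the managed resources documentation for your version):

```shell
# Keep the filter definition minimal:
#   <filter class="solr.ManagedStopFilterFactory" managed="english"/>
# ...and set ignoreCase on the managed resource instead:
curl -X POST -H 'Content-type: application/json' \
  --data-binary '{"initArgs":{"ignoreCase":true}}' \
  'http://localhost:8983/solr/collection1/schema/analysis/stopwords/english'
# A core reload is needed before the change takes effect.
```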
Re: Solr's suggester results
Edwin, the spellcheck is one thing, the Suggester is another. If you need to provide auto-suggestion to your users, the suggester is the right thing to use. But I really doubt it is useful to select the entire content as the suggester field; it is going to be quite expensive. In that case I would again really suggest you take a look at the article I quoted and the generic Solr documentation. It is possible to suggest part of the field: you can use the FreeText suggester with a proper analysis selected. Cheers

2015-06-17 6:14 GMT+01:00 Zheng Lin Edwin Yeo edwinye...@gmail.com: Yes, I've looked at that before, but I was told that the newer version of Solr has its own suggester, and does not need to use the spellchecker anymore? So it's not necessary to use the spellchecker inside the suggester anymore? Regards, Edwin

On 17 June 2015 at 11:56, Erick Erickson erickerick...@gmail.com wrote: Have you looked at spellchecker? Because that sounds much more like what you're asking about than suggester. Spell checking is more what you're asking for; have you even looked at that after it was suggested? bq: "Also, when I do a search, it shouldn't be returning whole fields, but just to return a portion of the sentence" - this is what highlighting is built for. Really, I recommend you take the time to do some familiarization with the whole search space and Solr. The excellent book here: http://www.amazon.com/Solr-Action-Trey-Grainger/dp/1617291021/ref=sr_1_1?ie=UTF8qid=1434513284sr=8-1keywords=apache+solrpebp=1434513287267perid=0YRK508J0HJ1N3BAX20E will give you the grounding you need to get the most out of Solr. Best, Erick

On Tue, Jun 16, 2015 at 8:27 PM, Zheng Lin Edwin Yeo edwinye...@gmail.com wrote: The long content is from when I tried to index PDF files. As some PDF files have a lot of words in the content, it will lead to the "UTF8 encoding is longer than the max length 32766" error. I think the problem is that the content size of the PDF file exceeds 32766 characters? I'm trying to accomplish being able to index documents of any size (even those with very large contents), and build the suggester from there. Also, when I do a search, it shouldn't return whole fields, but just a portion of the sentence. Regards, Edwin

On 16 June 2015 at 23:02, Erick Erickson erickerick...@gmail.com wrote: The suggesters are built to return whole fields. You _might_ be able to add multiple fragments to a multiValued entry and get fragments; I haven't tried that, though, and I suspect you'd actually get the same thing. This is an XY problem IMO. Please describe exactly what you're trying to accomplish, with examples, rather than continuing down this path. It sounds like you want spellcheck or similar. The _point_ behind the suggesters is that they handle multiple-word suggestions by returning the whole field. So putting long text fields into them is not going to work. Best, Erick

On Tue, Jun 16, 2015 at 1:46 AM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: in line: 2015-06-16 4:43 GMT+01:00 Zheng Lin Edwin Yeo edwinye...@gmail.com: Thanks Benedetti, I've changed to the AnalyzingInfixLookup approach, and it is able to start searching from the middle of the field. However, is it possible to make the suggester show only part of the content of the field (like 2 or 3 fields after), instead of the entire content/sentence, which can be quite long?

I assume you use fields in place of tokens. The answer is yes; I already said that in my previous mail. I invite you to read the answers and the linked documentation carefully! Regarding the excessive size of the tokens: this is weird. What are you trying to autocomplete? I really doubt it would be useful for a user to see super-long autocompleted terms. Cheers

Regards, Edwin

On 15 June 2015 at 17:33, Alessandro Benedetti benedetti.ale...@gmail.com wrote: ehehe Edwin, I think you should read again the document I linked some time ago: http://lucidworks.com/blog/solr-suggester/ The suggester you used is not meant to provide infix suggestions. The fuzzy suggester works on a fuzzy basis with the *starting* terms of a field's content. What you are looking for is actually one of the infix suggesters, for example the AnalyzingInfixLookup approach. When working with suggesters it is important first to make a distinction:
1) Returning the full content of the field (analysisInfix or Fuzzy)
2) Returning token(s) (Free Text Suggester)
Then the second difference is:
1) Infix suggestions ( from the middle of the field
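For reference, the AnalyzingInfixLookup approach Alessandro recommends is wired up roughly like this on Solr 4.7+ (field and suggester names here are placeholders, not from this thread):

```xml
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">infixSuggester</str>
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">title</str> <!-- suggest from a short field, not full content -->
    <str name="suggestAnalyzerFieldType">text_general</str>
    <str name="buildOnStartup">false</str>
  </lst>
</searchComponent>
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.dictionary">infixSuggester</str>
    <str name="suggest.count">10</str>
  </lst>
  <arr name="components"><str>suggest</str></arr>
</requestHandler>
```

The Lucidworks article linked in the thread walks through the same component in more detail.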
Re: ZooKeeper connection refused
2015-06-17 15:34 GMT+02:00 Shalin Shekhar Mangar shalinman...@gmail.com: You are asking telnet to connect to zk1 on port 2181 but you have not specified the port to Solr. You should set ZK_HOST=zk1:2181,zk2:2181,zk3:2181 instead. I modified the ZK_HOST instance with the port, but the problem is not solved. Do you have any ideas?
Re: mapreduce job using soirj 5
I think there is some better classpath isolation options in the works for Hadoop. As it is, there is some harmonization that has to be done depending on versions used, and it can get tricky. - Mark On Wed, Jun 17, 2015 at 9:52 AM Erick Erickson erickerick...@gmail.com wrote: For sure there are a few rough edges here On Wed, Jun 17, 2015 at 12:28 AM, adfel70 adfe...@gmail.com wrote: We cannot downgrade httpclient in solrj5 because its using new features and we dont want to start altering solr code, anyway we thought about upgrading httpclient in hadoop but as Erick said its sounds more work than just put the jar in the data nodes. About that flag we tried it, hadoop even has an environment variable HADOOP_USER_CLASSPATH_FIRST but all our tests with that flag failed. We thought this is an issue that is more likely that solr users will encounter rather than cloudera users, so we will be glad for a more elegant solution or workaround than to replace the httpclient jar in the data nodes Thank you all for your responses -- View this message in context: http://lucene.472066.n3.nabble.com/mapreduce-job-using-soirj-5-tp4212199p4212350.html Sent from the Solr - User mailing list archive at Nabble.com. -- - Mark about.me/markrmiller
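For readers comparing notes, these are the configuration knobs this thread is discussing; property names vary across Hadoop versions, and as reported above they did not resolve the reporter's conflict:

```xml
<!-- mapred-site.xml or per-job conf: prefer user jars on the task classpath -->
<property>
  <name>mapreduce.job.user.classpath.first</name> <!-- Hadoop 2.x name -->
  <value>true</value>
</property>
<!-- some older/vendor builds use mapreduce.task.classpath.user.precedence instead,
     as mentioned earlier in this thread -->
```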
Re: mapreduce job using soirj 5
For sure there are a few rough edges here On Wed, Jun 17, 2015 at 12:28 AM, adfel70 adfe...@gmail.com wrote: We cannot downgrade httpclient in solrj5 because its using new features and we dont want to start altering solr code, anyway we thought about upgrading httpclient in hadoop but as Erick said its sounds more work than just put the jar in the data nodes. About that flag we tried it, hadoop even has an environment variable HADOOP_USER_CLASSPATH_FIRST but all our tests with that flag failed. We thought this is an issue that is more likely that solr users will encounter rather than cloudera users, so we will be glad for a more elegant solution or workaround than to replace the httpclient jar in the data nodes Thank you all for your responses -- View this message in context: http://lucene.472066.n3.nabble.com/mapreduce-job-using-soirj-5-tp4212199p4212350.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Joins with comma separated values
You can potentially just use a text_general field, in which case your comma separated words will be effectively a multi-valued field. I believe that will work. As to how you want to use joins, that isn't possible. They are pseudo joins, not full joins. They will not be able to include data from the joined field in the result. Upayavira On jun 6, Advait Suhas Pandit wrote: Hi, We have some master data and some content data. Master data would be things like userid, name, email id etc. Our content data for example is a blog. The blog has certain fields which are comma separated ids that point to the master data. E.g. UserIDs of people who have commented on a particular blog can be found in the blog table in a comma separated field of userids. Similarly userids of people who have liked the blog can be found in a comma separated field of userids. How do I join this comma separated list of userids with the master data so that I can get the other details of the user such as name, email, picture etc? Thanks, Advait
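Assuming the comma-separated ids end up in a multiValued field (say commenter_ids, a hypothetical name) and user documents have an id field, the pseudo-join looks like this. As Upayavira notes, it can only filter one side by the other, not merge user fields into blog results:

```
# user documents whose id occurs in commenter_ids of blogs matching title:solr
q={!join from=commenter_ids to=id}title:solr

# blogs whose commenters include users named Advait
q={!join from=id to=commenter_ids}name:Advait
```

Pulling the user's name, email, etc. into the blog result itself still requires a second query (or denormalizing those fields into the blog documents at index time).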
Re: mapreduce job using soirj 5
We cannot downgrade httpclient in solrj5 because it's using new features, and we don't want to start altering Solr code. Anyway, we thought about upgrading httpclient in hadoop, but as Erick said it sounds like more work than just putting the jar on the data nodes. About that flag: we tried it (hadoop even has an environment variable HADOOP_USER_CLASSPATH_FIRST), but all our tests with that flag failed. We think this is an issue that solr users are more likely to encounter than cloudera users, so we would be glad for a more elegant solution or workaround than replacing the httpclient jar on the data nodes. Thank you all for your responses -- View this message in context: http://lucene.472066.n3.nabble.com/mapreduce-job-using-soirj-5-tp4212199p4212350.html Sent from the Solr - User mailing list archive at Nabble.com.
Solr/ZK issues
Hi Folks, We are seeing the following in our logs on our Solr nodes after which Solr nodes go into multiple full GCs and eventually runs out of heap. We saw this ticket - https://issues.apache.org/jira/browse/SOLR-7338 - wondering that’s the one causing it. We are currently on 4.10.0 INFO - 2015-06-17 08:06:28.163; org.apache.solr.common.cloud.ConnectionManager; Watcher org.apache.solr.common.cloud.ConnectionManager@422f41e9 name:ZooKeeperConnection Watcher:got event WatchedEvent state:Expired type:None path:null path:null type:None INFO - 2015-06-17 08:06:28.163; org.apache.solr.common.cloud.ConnectionManager; Our previous ZooKeeper session was expired. Attempting to reconnect to recover relationship with ZooKeeper... INFO - 2015-06-17 08:06:28.166; org.apache.solr.common.cloud.DefaultConnectionStrategy; Connection expired - starting a new one... INFO - 2015-06-17 08:06:28.171; org.apache.solr.common.cloud.ConnectionManager; Waiting for client to connect to ZooKeeper INFO - 2015-06-17 08:06:28.177; org.apache.solr.common.cloud.ConnectionManager; Watcher org.apache.solr.common.cloud.ConnectionManager@422f41e9 name:ZooKeeperConnection Watcher: got event WatchedEvent state:SyncConnected type:None path:null path:null type:None INFO - 2015-06-17 08:06:28.177; org.apache.solr.common.cloud.ConnectionManager; Client is connected to ZooKeeper INFO - 2015-06-17 08:06:28.178; org.apache.solr.common.cloud.ConnectionManager$1; Connection with ZooKeeper reestablished. 
INFO - 2015-06-17 08:06:28.178; org.apache.solr.common.cloud.DefaultConnectionStrategy; Reconnected to ZooKeeper
INFO - 2015-06-17 08:06:28.179; org.apache.solr.common.cloud.ConnectionManager; Connected:true
WARN - 2015-06-17 08:06:28.179; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for core=category coreNodeName=core_node2
WARN - 2015-06-17 08:06:28.180; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for core=category_shadow coreNodeName=core_node2
WARN - 2015-06-17 08:06:28.180; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for core=rules_shadow coreNodeName=core_node2
WARN - 2015-06-17 08:06:28.180; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for core=rules coreNodeName=core_node2
WARN - 2015-06-17 08:06:28.180; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for core=catalog_shadow coreNodeName=core_node2
WARN - 2015-06-17 08:06:28.180; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for core=catalog coreNodeName=core_node2
INFO - 2015-06-17 08:06:28.180; org.apache.solr.cloud.ZkController; publishing core=category state=down collection=category
INFO - 2015-06-17 08:06:28.180; org.apache.solr.cloud.ZkController; numShards not found on descriptor - reading it from system property
INFO - 2015-06-17 08:06:28.186; org.apache.solr.cloud.ZkController; publishing core=category_shadow state=down collection=category_shadow
INFO - 2015-06-17 08:06:28.186; org.apache.solr.cloud.ZkController; numShards not found on descriptor - reading it from system property
INFO - 2015-06-17 08:06:28.189; org.apache.solr.cloud.ZkController; publishing core=rules_shadow state=down collection=rules_shadow
INFO - 2015-06-17 08:06:28.189; org.apache.solr.cloud.ZkController; numShards not found on descriptor - reading it from system property
INFO - 2015-06-17 08:06:28.191; org.apache.solr.cloud.ZkController; publishing core=rules state=down collection=rules
INFO - 2015-06-17 08:06:28.191; org.apache.solr.cloud.ZkController; numShards not found on descriptor - reading it from system property
INFO - 2015-06-17 08:06:28.193; org.apache.solr.cloud.ZkController; publishing core=catalog_shadow state=down collection=catalog_shadow
INFO - 2015-06-17 08:06:28.193; org.apache.solr.cloud.ZkController; numShards not found on descriptor - reading it from system property
INFO - 2015-06-17 08:06:28.194; org.apache.solr.cloud.ZkController; publishing core=catalog state=down collection=catalog
INFO - 2015-06-17 08:06:28.194; org.apache.solr.cloud.ZkController; numShards not found on descriptor - reading it from system property
INFO - 2015-06-17 08:06:28.198; org.apache.solr.cloud.ZkController; Replica core_node2 NOT in leader-initiated recovery, need to wait for leader to see down state.
WARN - 2015-06-17 08:07:51.188; org.apache.solr.cloud.ZkController; org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /collections/rules_shadow/leader_elect/shard1/election
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1472)
        at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:290)
        at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:287)
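The expiry-then-reconnect pattern in these logs is the classic symptom of a GC pause longer than the ZooKeeper session timeout. As a stopgap while the heap and GC behavior is investigated, the timeout can be raised in solr.xml. A sketch for Solr 4.x (zkClientTimeout is the standard setting name, but check the solr.xml format for your release; note a longer timeout only masks the pauses, it does not remove them):

```xml
<solrcloud>
  <!-- tolerate GC pauses of up to 60s before ZooKeeper expires the session -->
  <int name="zkClientTimeout">60000</int>
</solrcloud>
```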
Re: suggester returning stems instead of whole words
working in a tiny tmux window does have some disadvantages, such as losing one’s place in the file! the subject_autocomplete definition wasn’t inside fields. Now that it is, everything is working. thanks for listening Alistair -- mov eax,1 mov ebx,0 int 80h

On 17/06/2015 15:17, Alistair Young alistair.yo...@uhi.ac.uk wrote: looking at the schema browser, subject_autocomplete has a type of text_en rather than text_auto and all the terms are stemmed. Its contents are the same as the one it was copied from, dc.subject, which is text_en and stemmed.

On 17/06/2015 14:58, Erick Erickson erickerick...@gmail.com wrote: Hmmm, shouldn't be happening that way. Spellcheck is supposed to be looking at indexed terms. If you go into the admin/schema browser page and look at the new field, what are the terms in the index? They shouldn't be stemmed. And I always get confused where this <str name="spellcheck.dictionary">suggest</str> is supposed to point. Do you have any other component named suggest that you might be picking up? Best, Erick

On Wed, Jun 17, 2015 at 6:50 AM, Alistair Young alistair.yo...@uhi.ac.uk wrote: yep did both of those things. Getting the same results as using dc.subject

On 17/06/2015 14:44, Shalin Shekhar Mangar shalinman...@gmail.com wrote: Did you change the SpellCheckComponent's configuration to use subject_autocomplete instead of dc.subject? After you made that change, did you invoke spellcheck.build=true to re-build the spellcheck index?

On Wed, Jun 17, 2015 at 7:06 PM, Alistair Young alistair.yo...@uhi.ac.uk wrote: copyField doesn't seem to fix the suggestion stemming.
Copying the field to another field of this type:

  <field name="subject_autocomplete" type="text_auto" indexed="true" stored="true" multiValued="false"/>
  <copyField source="dc.subject" dest="subject_autocomplete"/>
  <fieldType class="solr.TextField" name="text_auto" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

but I'm still getting stemmed suggestions after rebuilding the index. Alistair -- mov eax,1 mov ebx,0 int 80h

On 17/06/2015 11:28, Alistair Young alistair.yo...@uhi.ac.uk wrote: ah looks like I need to use copyField to get a non stemmed version of the suggester field Alistair -- mov eax,1 mov ebx,0 int 80h

On 17/06/2015 11:15, Alistair Young alistair.yo...@uhi.ac.uk wrote: I was wondering if there's a way to get the suggester to return whole words. Instead of returning 'technology', 'temperature' and 'tutorial', it's returning 'technolog', 'temperatur' and 'tutori' using this config:

  <searchComponent class="solr.SpellCheckComponent" name="suggest">
    <lst name="spellchecker">
      <str name="name">suggest</str>
      <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
      <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.WFSTLookupFactory</str>
      <str name="field">dc.subject</str>
      <float name="threshold">0.005</float>
      <str name="buildOnCommit">true</str>
    </lst>
  </searchComponent>
  <requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
    <lst name="defaults">
      <str name="spellcheck">true</str>
      <str name="spellcheck.dictionary">suggest</str>
      <str name="spellcheck.onlyMorePopular">true</str>
      <str name="spellcheck.count">10</str>
      <str name="spellcheck.collate">true</str>
    </lst>
    <arr name="components">
      <str>suggest</str>
    </arr>
  </requestHandler>

thanks, Alistair -- mov eax,1 mov ebx,0 int 80h -- Regards, Shalin Shekhar Mangar.
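Following Shalin's question in the thread (did you change the SpellCheckComponent to use subject_autocomplete?), the suggester also has to be pointed at the unstemmed copyField target rather than dc.subject. A sketch of the one-line change to the component quoted above:

```xml
<searchComponent class="solr.SpellCheckComponent" name="suggest">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.WFSTLookupFactory</str>
    <!-- changed: build suggestions from the unstemmed copyField target -->
    <str name="field">subject_autocomplete</str>
    <float name="threshold">0.005</float>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>
```

After changing the field, rebuild the dictionary (e.g. spellcheck.build=true) so suggestions come from the new field's terms.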
Re: ManagedStopFilterFactory not accepting ignoreCase
Oh, I see you already did :) - thanks. - Steve On Jun 17, 2015, at 11:10 AM, Steve Rowe sar...@gmail.com wrote: Hi Mike, Looks like a bug to me - would you please create a JIRA? Thanks, Steve
Re: suggester returning stems instead of whole words
yep, 4.3.1. The API changed after that, so it's a matter of finding the time to rewrite the entire backend that uses it.

On 17/06/2015 16:55, Shalin Shekhar Mangar shalinman...@gmail.com wrote: You must be using an old version of Solr. Since Solr 4.8, the fields and types tags have been deprecated and you can place the field and field type definitions anywhere in the schema.xml. See http://issues.apache.org/jira/browse/SOLR-5228

On Wed, Jun 17, 2015 at 9:09 PM, Alistair Young alistair.yo...@uhi.ac.uk wrote: working in a tiny tmux window does have some disadvantages, such as losing one’s place in the file! the subject_autocomplete definition wasn’t inside fields. Now that it is, everything is working. thanks for listening Alistair -- mov eax,1 mov ebx,0 int 80h
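For readers on pre-4.8 schemas like this 4.3.1 setup: definitions must sit inside the <fields> and <types> wrappers for Solr to pick them up. A minimal sketch of the layout that resolved the thread (names taken from the thread; the analyzer body is abbreviated from the quoted config):

```xml
<schema name="example" version="1.4">
  <types>
    <fieldType name="text_auto" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>
  </types>
  <fields>
    <!-- must be inside <fields> on pre-4.8 schemas -->
    <field name="subject_autocomplete" type="text_auto" indexed="true" stored="true" multiValued="false"/>
  </fields>
  <copyField source="dc.subject" dest="subject_autocomplete"/>
</schema>
```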
Re: ManagedStopFilterFactory not accepting ignoreCase
Hi Mike, Looks like a bug to me - would you please create a JIRA? Thanks, Steve

On Jun 17, 2015, at 10:29 AM, Mike Thomsen mikerthom...@gmail.com wrote: We're running Solr 4.10.4 and getting this...

Caused by: java.lang.IllegalArgumentException: Unknown parameters: {ignoreCase=true}
        at org.apache.solr.rest.schema.analysis.BaseManagedTokenFilterFactory.init(BaseManagedTokenFilterFactory.java:46)
        at org.apache.solr.rest.schema.analysis.ManagedStopFilterFactory.init(ManagedStopFilterFactory.java:47)

This is the filter definition I used:

  <filter class="solr.ManagedStopFilterFactory" ignoreCase="true" managed="english"/>

Any ideas? Thanks, Mike
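Until the attribute bug is fixed, ignoreCase may be workable through the managed resource itself rather than as a factory attribute. A sketch of the managed stop-words JSON, assuming the 4.x managed-resources REST API (the endpoint path and exact JSON shape should be verified against the Reference Guide for your release; the word list here is just a placeholder):

```json
{
  "initArgs": { "ignoreCase": true },
  "managedList": [ "a", "an", "the" ]
}
```

This would be sent with PUT to something like /solr/<core>/schema/analysis/stopwords/english, followed by a core reload so the analyzer picks up the setting.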
Re: suggester returning stems instead of whole words
You must be using an old version of Solr. Since Solr 4.8, the fields and types tags have been deprecated and you can place the field and field type definitions anywhere in the schema.xml. See http://issues.apache.org/jira/browse/SOLR-5228

On Wed, Jun 17, 2015 at 9:09 PM, Alistair Young alistair.yo...@uhi.ac.uk wrote: working in a tiny tmux window does have some disadvantages, such as losing one’s place in the file! the subject_autocomplete definition wasn’t inside fields. Now that it is, everything is working. thanks for listening Alistair -- mov eax,1 mov ebx,0 int 80h

-- Regards, Shalin Shekhar Mangar.