Re: IF function and FieldList

2014-05-23 Thread Arcadius Ahouansou
Thanks Erick.

Arcadius.


On 22 May 2014 22:14, Erick Erickson erickerick...@gmail.com wrote:

 Why not just return them all and sort it out in the app layer? Seems
 easier.

 Or consider doc transformers I suppose.

 Best,
 Erick

 On Thu, May 22, 2014 at 10:20 AM, Arcadius Ahouansou
 arcad...@menelic.com wrote:
  Hello.
 
  I need to have a dynamically assigned field list (fl) depending on the
  existence of a field in the response.
  I need to do something like:
 
  fl=if(exists(field0),field0 field1,field2 field3)
 
  The problem is that the if function does not like the space.
  I have tried many combinations, like double or single quotes around the field
 list:
  fl=if(exists(field0),'field0 field1','field2 field3')
  or
  fl=if(exists(field0),field0,field1,field2,field3)
 
  or parentheses, etc.
 
  Any help would be very appreciated.
 
  Thanks.
 
  Arcadius.
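Erick's suggestion, return all the candidate fields and decide in the application layer, can be sketched as follows (a hypothetical client-side helper, not part of any Solr API; the field names field0-field3 come from the question):

```python
def select_fields(doc):
    """Client-side equivalent of fl=if(exists(field0),'field0 field1','field2 field3').

    Request all four fields from Solr, then project each returned document.
    """
    wanted = ("field0", "field1") if "field0" in doc else ("field2", "field3")
    return {name: doc[name] for name in wanted if name in doc}

# Example: two documents as returned with fl=field0,field1,field2,field3
docs = [
    {"field0": "a", "field1": "b", "field2": "c"},
    {"field2": "x", "field3": "y"},
]
projected = [select_fields(d) for d in docs]
```

This trades a slightly larger response for a query string that Solr's function parser can actually handle.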



index a repository of documents(.doc) without using post.jar

2014-05-23 Thread benjelloun
Hello,

I need to index a repository of documents (.doc) without using post.jar; I'm
using Solr with Tomcat 6.
Maybe it is done with the HTTP REST API, but how do I use it?
Thanks for your answer,

Best regards,
Anass BENJELLOUN



--
View this message in context: 
http://lucene.472066.n3.nabble.com/index-a-repository-of-documents-doc-without-using-post-jar-tp4137798.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: How to Disable Commit Option and Just Manage it via SolrConfig?

2014-05-23 Thread Furkan KAMACI
Hi Michael;

I've written an API to which users send their requests. I forward their
queries to Solr, manage which collection is theirs, and drop the query
parameters related to commits. However, users can send the commitWithin
option within their request data, and I would have to analyze the request
body to disallow it. That's why I'm looking for a solution within Solr,
without customizing it.

Thanks;
Furkan KAMACI


2014-05-22 22:16 GMT+03:00 Michael Della Bitta 
michael.della.bi...@appinions.com:

 Just a thought: If your users can send updates and you can't trust them,
 how can you keep them from deleting all your data?

 I would consider using a servlet filter to inspect the request. That would
 probably be non-trivial if you plan to accept javabin requests as well.

 Michael Della Bitta

 Applications Developer

 o: +1 646 532 3062

 appinions inc.

 “The Science of Influence Marketing”

 18 East 41st Street

 New York, NY 10017

 t: @appinions https://twitter.com/Appinions | g+:
 plus.google.com/appinions
 https://plus.google.com/u/0/b/112002776285509593336/112002776285509593336/posts
 
 w: appinions.com http://www.appinions.com/


 On Thu, May 22, 2014 at 6:36 AM, Furkan KAMACI furkankam...@gmail.com
 wrote:

  Hi All;
 
  I've designed a system that allows people to use a search service from
  SolrCloud. However, I think that I should disable the commit option for
  people to avoid performance issues (many users can send commit requests,
  and this may cause performance issues). I'll configure the Solr config
  file with autocommit and not let people commit individually.
 
  I've done some implementation for it and people can not send commit
 request
  by GET as like:
 
  localhost:8983/solr/update?commit=true
 
  and they can not use:
 
  HttpSolrServer solrServer = new HttpSolrServer("http://localhost:8983/solr");
  solrServer.commit();
 
  I think that there is another way to send a commit request to Solr. It is
  something like:
 
  {"add":{"doc":{"id":"change.me","title":"change.me"},"boost":1.0,"overwrite":true,"commitWithin":1000}}
 
  So, I want to stop that usage, and my current implementation does not
  prevent it.
 
  My question is: Is there any way I can close the commit option for Solr
  to clients (the outside world) and manage that option only via the
  Solr config?
 
  Thanks;
  Furkan KAMACI
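The proxy-side stripping Furkan describes can be sketched like this (a standalone sketch with assumed parameter and key names; real request parsing depends on the API framework in use, and XML/javabin update bodies would need equivalent handling):

```python
import json
from urllib.parse import parse_qsl, urlencode

# Parameters the proxy refuses to forward (assumed list; extend as needed).
BLOCKED_PARAMS = {"commit", "softCommit", "commitWithin", "optimize"}

def strip_commit_params(query_string):
    """Drop commit-related parameters from a URL query string before forwarding."""
    kept = [(k, v) for k, v in parse_qsl(query_string) if k not in BLOCKED_PARAMS]
    return urlencode(kept)

def strip_commit_within(body):
    """Remove commitWithin from a JSON update body before forwarding it to Solr."""
    data = json.loads(body)
    if isinstance(data.get("add"), dict):
        data["add"].pop("commitWithin", None)
    return json.dumps(data)
```

The need to inspect every supported body format is exactly why a single Solr-side switch would be preferable.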
 



Re: index a repository of documents(.doc) without using post.jar

2014-05-23 Thread Alexandre Rafalovitch
post.jar is just there for convenience. Look at the relevant wiki
pages for actual URL examples:
https://wiki.apache.org/solr/UpdateXmlMessages

Regards,
   Alex
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency


On Fri, May 23, 2014 at 3:36 PM, benjelloun anass@gmail.com wrote:
 Hello,

 I need to index a repository of documents (.doc) without using post.jar; I'm
 using Solr with Tomcat 6.
 Maybe it is done with the HTTP REST API, but how do I use it?
 Thanks for your answer,

 Best regards,
 Anass BENJELLOUN



 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/index-a-repository-of-documents-doc-without-using-post-jar-tp4137797.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Import data from Mysql concat issues

2014-05-23 Thread anarchos78
Hi,

I'm trying to index data from MySQL. The indexing is successful. Then I
tried to use the MySQL CONCAT function (in data-config.xml) in order to
concatenate a custom string with a field, like this: CONCAT('(',
CAST('ΤΜΗΜΑ' AS CHAR CHARACTER SET utf8), ' ', apofasi_tmima, ')'). The
custom string ('ΤΜΗΜΑ') is in Greek. When I try to query the field, Solr
returns ? instead of ΤΜΗΜΑ. I have also used this: CONCAT('(',
'ΤΜΗΜΑ', ' ', apofasi_tmima, ')'), with no success.
The data-config.xml file is UTF-8 encoded, and at the beginning there is
the <?xml version="1.0" encoding="UTF-8"?> XML directive. I have also
tried to set "characterEncoding=utf8" in the dataSource URL, but then indexing
fails.
What am I missing here? Is there any workaround for this?
Below is a snippet from data-config.xml:

<?xml version="1.0" encoding="UTF-8"?>
<dataConfig>
  <dataSource type="JdbcDataSource" autoCommit="true" batchSize="-1"
    convertType="false" driver="com.mysql.jdbc.Driver"
    url="jdbc:mysql://127.0.0.1:3306/apofaseis?zeroDateTimeBehavior=convertToNull"
    user="root" password="" name="db" />
  <dataSource name="fieldReader" type="FieldStreamDataSource" />
  <document>
    <entity name="apofaseis_2000"
      dataSource="db"
      transformer="HTMLStripTransformer"
      query="select id, CONCAT_WS('', CONCAT(apofasi_number, '/', apofasi_date,
        ' ', (CASE apofasi_tmima WHEN NULL THEN '' WHEN '' THEN '' ELSE CONCAT('(',
        CAST('ΤΜΗΜΑ' AS CHAR CHARACTER SET utf8), ' ', apofasi_tmima, ')') END))) AS
        grid_title, CAST(CONCAT_WS('_',id,model) AS CHAR) AS solr_id,
        apofasi_number, apofasi_date, apofasi_tmima, CONCAT(IFNULL(apofasi_thema,
        ''), ' ', IFNULL(apofasi_description, ''), ' ', apofasi_body) AS content,
        type, model, url, search_tag, last_modified, CONCAT_WS('',
        CONCAT(apofasi_number, '/', apofasi_date, ' ', (CASE apofasi_tmima WHEN NULL
        THEN '' WHEN '' THEN '' ELSE CONCAT('(', CAST('ΤΜΗΜΑ' AS CHAR CHARACTER SET
        utf8), ' ', apofasi_tmima, ')') END))) AS title from apofaseis_2000 where
        type = 'text'"
...
...


Regards,
anarchos78




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Import-data-from-Mysql-concat-issues-tp4137814.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: How to Disable Commit Option and Just Manage it via SolrConfig?

2014-05-23 Thread Jack Krupansky
There is no direct Solr configuration option to disable commit requests that 
I know of.


Maybe you could do it with an update processor. The processAdd method is
called to process a document; it is passed an AddUpdateCommand object for a
single document, which has a field for the commitWithin setting. I don't see a
public method for zapping commitWithin, but I didn't look too deeply. Worst
case, you would need to substitute your own equivalent of
RunUpdateProcessorFactory that ignores the commitWithin setting in
processAdd; maybe you could subclass and extend the existing class, or maybe
you would have to copy and edit it.


Also, note that the delete command also has a commitWithin setting.

-- Jack Krupansky

-Original Message- 
From: Furkan KAMACI

Sent: Thursday, May 22, 2014 6:36 AM
To: solr-user@lucene.apache.org
Subject: How to Disable Commit Option and Just Manage it via SolrConfig?

Hi All;

I've designed a system that allows people to use a search service from
SolrCloud. However, I think that I should disable the commit option for people
to avoid performance issues (many users can send commit requests, and this
may cause performance issues). I'll configure the Solr config file with
autocommit and not let people commit individually.

I've done some implementation for it and people can not send commit request
by GET as like:

localhost:8983/solr/update?commit=true

and they can not use:

HttpSolrServer solrServer = new HttpSolrServer("http://localhost:8983/solr");
solrServer.commit();

I think that there is another way to send a commit request to Solr. It is
something like:

{"add":{"doc":{"id":"change.me","title":"change.me"},"boost":1.0,"overwrite":true,"commitWithin":1000}}

So, I want to stop that usage, and my current implementation does not
prevent it.

My question is: Is there any way I can close the commit option for Solr
to clients (the outside world) and manage that option only via the
Solr config?

Thanks;
Furkan KAMACI 



Re: index a repository of documents(.doc) without using post.jar

2014-05-23 Thread Jack Krupansky
Is there a particular reason you are averse to using post.jar? I mean, if 
there is some bug or inconvenience, let us know so we can fix it!


The Solr server itself does not provide any ability to crawl file systems 
(LucidWorks Search does); post.jar does provide that convenience.


-- Jack Krupansky

-Original Message- 
From: benjelloun

Sent: Friday, May 23, 2014 4:36 AM
To: solr-user@lucene.apache.org
Subject: index a repository of documents(.doc) without using post.jar

Hello,

I need to index a repository of documents (.doc) without using post.jar; I'm
using Solr with Tomcat 6.
Maybe it is done with the HTTP REST API, but how do I use it?
Thanks for your answer,

Best regards,
Anass BENJELLOUN



--
View this message in context: 
http://lucene.472066.n3.nabble.com/index-a-repository-of-documents-doc-without-using-post-jar-tp4137797.html
Sent from the Solr - User mailing list archive at Nabble.com. 



java.io.EOFException: seek past EOF

2014-05-23 Thread aarthi
Hi
We are getting the seek past EOF exception in Solr. This occurs randomly, and
after a reindex we are able to access data again. After running CheckIndex,
we got no corrupt blocks. Kindly throw light on the issue. The following is
the error log:


2014-05-21 13:57:29,172 INFO processor.LogUpdateProcessor - [LucidWorksLogs]
webapp= path=/update
params={waitSearcher=true&commit=true&wt=javabin&expungeDeletes=false&commit_end_point=true&version=2&softCommit=false&update.chain=lucid-update-chain}
{commit=} 0 14
2014-05-21 13:57:56,139 ERROR core.SolrCore - java.io.EOFException: seek
past EOF:
MMapIndexInput(path=/xxx/xxx/xxx/LucidWorks/LucidWorksSearch/conf/solr/test_Index/data/index.20140515122858307/_cgx_Lucene41_0.doc)
at
org.apache.lucene.store.ByteBufferIndexInput.seek(ByteBufferIndexInput.java:174)
at
org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsEnum.reset(Lucene41PostingsReader.java:407)
at
org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.docs(Lucene41PostingsReader.java:293)
at
org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.docs(BlockTreeTermsReader.java:2188)
at
org.apache.lucene.search.FieldCacheImpl$SortedDocValuesCache.createValue(FieldCacheImpl.java:1240)
at
org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:212)
at
org.apache.lucene.search.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:1167)
at
org.apache.lucene.search.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:1147)
at
org.apache.lucene.search.FieldComparator$TermOrdValComparator.setNextReader(FieldComparator.java:1056)
at
org.apache.lucene.search.grouping.AbstractFirstPassGroupingCollector.setNextReader(AbstractFirstPassGroupingCollector.java:332)
at
org.apache.lucene.search.grouping.term.TermFirstPassGroupingCollector.setNextReader(TermFirstPassGroupingCollector.java:89)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:615)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:297)
at 
org.apache.solr.search.Grouping.searchWithTimeLimiter(Grouping.java:426)
at org.apache.solr.search.Grouping.execute(Grouping.java:348)
at
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:408)
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
at
com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
at
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
at
com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
at
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
at
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
at
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
at
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
at org.eclipse.jetty.server.Server.handle(Server.java:351)
at
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)
at
org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
at
org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:890)
at
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:944)
at 

Re: index a repository of documents(.doc) without using post.jar

2014-05-23 Thread benjelloun
Hello,

There is no inconvenience; I just need to index some files from the file system
using Java EE and Tomcat 6. Maybe there is a function that calls HTTP REST.
Maybe there is a solution to integrate post.jar into Tomcat 6.
If you know of any solution to my problem, please suggest it.
Thanks,

Best regards,
Anass BENJELLOUN



--
View this message in context: 
http://lucene.472066.n3.nabble.com/index-a-repository-of-documents-doc-without-using-post-jar-tp4137797p4137848.html
Sent from the Solr - User mailing list archive at Nabble.com.


Fwd: Question to send

2014-05-23 Thread rashi gandhi
HI,



I have one running Solr core with some data indexed on the Solr server.

This core is designed to provide OpenNLP functionality for indexing and
searching.

So I have kept the following binary models at this location:
\apache-tomcat-7.0.53\solr\collection1\conf\opennlp

- en-sent.bin
- en-token.bin
- en-pos-maxent.bin
- en-ner-person.bin
- en-ner-location.bin

My problem is: when I unload the running core and try to delete the conf
directory from it, it does not allow me to delete the directory, prompting
that en-sent.bin and en-token.bin are in use.

If I have unloaded the core, why is it not releasing its lock on the files?

Is this a known issue with the OpenNLP binaries?

How can I release the connection between the unloaded core and the conf
directory (especially the binary models)?

Please provide me some pointers on this.

Thanks in advance


Re: index a repository of documents(.doc) without using post.jar

2014-05-23 Thread Jack Krupansky
Feel free to look at the source code for post.jar. I mean, all it is really 
doing is scanning the directory (optionally recursively) and then streaming 
each file to Solr.


-- Jack Krupansky

-Original Message- 
From: benjelloun

Sent: Friday, May 23, 2014 8:15 AM
To: solr-user@lucene.apache.org
Subject: Re: index a repository of documents(.doc) without using post.jar

Hello,

There is no inconvenience; I just need to index some files from the file system
using Java EE and Tomcat 6. Maybe there is a function that calls HTTP REST.
Maybe there is a solution to integrate post.jar into Tomcat 6.
If you know of any solution to my problem, please suggest it.
Thanks,

Best regards,
Anass BENJELLOUN



--
View this message in context: 
http://lucene.472066.n3.nabble.com/index-a-repository-of-documents-doc-without-using-post-jar-tp4137797p4137848.html
Sent from the Solr - User mailing list archive at Nabble.com. 
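What post.jar does, scan a directory (optionally recursively) and stream each file to Solr, can be sketched in a few lines. For .doc files the usual target is the ExtractingRequestHandler at /update/extract, which must be enabled in solrconfig.xml; the host, port, and use of the file name as the id below are assumptions:

```python
import os
from urllib.parse import urlencode

SOLR_EXTRACT = "http://localhost:8080/solr/update/extract"  # assumed Tomcat port/core

def find_docs(root, extension=".doc"):
    """Recursively collect files with the given extension, like post.jar's scan."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.lower().endswith(extension):
                matches.append(os.path.join(dirpath, name))
    return matches

def extract_url(path):
    """Build the /update/extract URL for one file; the file body is POSTed separately."""
    params = urlencode({"literal.id": os.path.basename(path), "commit": "true"})
    return SOLR_EXTRACT + "?" + params

# Sketch of the upload loop (requires an HTTP client and a running Solr):
# for path in find_docs("/data/docs"):
#     post(extract_url(path), body=open(path, "rb"), content_type="application/msword")
```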



Re: index a repository of documents(.doc) without using post.jar

2014-05-23 Thread Ahmet Arslan
Hey Anass,

Have a look at another Apache project: http://manifoldcf.apache.org

It works with Tomcat/Solr. It is handy for handling deletions and incremental
updates.


On Friday, May 23, 2014 3:41 PM, benjelloun anass@gmail.com wrote:



Hello,

There is no inconvenience; I just need to index some files from the file system
using Java EE and Tomcat 6. Maybe there is a function that calls HTTP REST.
Maybe there is a solution to integrate post.jar into Tomcat 6.
If you know of any solution to my problem, please suggest it.
Thanks,

Best regards,
Anass BENJELLOUN



--
View this message in context: 
http://lucene.472066.n3.nabble.com/index-a-repository-of-documents-doc-without-using-post-jar-tp4137797p4137848.html

Sent from the Solr - User mailing list archive at Nabble.com.


Change the group.field name in the solr response

2014-05-23 Thread Prathik Puthran
Hi,

How can I change the field name in the grouped section of the Solr
response?
I know that for changing field names in the response where Solr returns
documents, you can make a query with fl changed to:
fl=mapping1:fieldname1,mapping2:fieldname2

How do I achieve the same thing for grouping?

For example, if Solr returns the below response for the grouped section when I
send a query with group.field=fieldname:

"grouped": {"fieldname": {"matches": 1, "ngroups": 1, "groups": [{"groupValue":
"11254", "doclist": {"numFound": 1, "start": 0, "docs": [{"store_id": 101,
"name": "tubelight", "fieldname": 14}]}}]}},

I want Solr to change fieldname in the response to some other
value I specify in the query.
How can I achieve this?

Thanks,
Prathik


Re: index a repository of documents(.doc) without using post.jar

2014-05-23 Thread benjelloun
Hello,

I looked to source code of post.jar, that was very interesting.
I looked for manifoldcf apache, that was interesting too.
But i what i want to do is indexing some files using http rest, this is my
request which dont work, maybe this way is the easiest for implementation:

put: localhost:8080/solr/update?commit=true
add
  doc
field name=titlekhalid/field
field name=descriptionbouchna9 /field
field name=date23/05/2014 /field
  /doc
/add

I'm using dev http client for test.
Thanks,
Anass BENJELLOUN




--
View this message in context: 
http://lucene.472066.n3.nabble.com/index-a-repository-of-documents-doc-without-using-post-jar-tp4137797p4137881.html
Sent from the Solr - User mailing list archive at Nabble.com.
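The request quoted above fails for two reasons worth separating: /update expects an HTTP POST (not PUT) with Content-Type: text/xml, and the body must be well-formed XML. A sketch that builds such a body (field names taken from the message; escaping handled by the standard library):

```python
from xml.etree import ElementTree as ET

def build_add_xml(fields):
    """Serialize one document as a Solr <add><doc>...</doc></add> update message."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", name=name)
        field.text = str(value)  # ElementTree escapes XML special characters
    return ET.tostring(add, encoding="unicode")

body = build_add_xml({"title": "khalid", "description": "bouchna9", "date": "23/05/2014"})
# POST this body to http://localhost:8080/solr/update?commit=true
# with the header Content-Type: text/xml; charset=utf-8
```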


Re: How does query on few-hits AND many-hits work

2014-05-23 Thread Per Steffensen
I can answer some of this myself, now that I have dived into it to 
understand what Solr/Lucene does and to see if it can be done better:
* In current Solr/Lucene (or at least in 4.4), indices on both 
no_dlng_doc_ind_sto and timestamp_dlng_doc_ind_sto are used, and the 
doc-id-sets found are intersected to get the final set of doc-ids.
* It IS more efficient to just use the index for the 
no_dlng_doc_ind_sto part of the request to get doc-ids that match that 
part, and then fetch timestamp doc-values for those doc-ids to filter out 
the docs that do not match the timestamp_dlng_doc_ind_sto part of 
the query. I have made changes to our version of Solr (and Lucene) to do 
that, and response times go from about 10 secs to about 1 sec (of course 
dependent on what's in the file cache, etc.) in cases where 
no_dlng_doc_ind_sto hits about 500-1000 docs and 
timestamp_dlng_doc_ind_sto hits about 3-4 billion.


Regards, Per Steffensen

On 19/05/14 13:33, Per Steffensen wrote:

Hi

 Let's say I have a Solr collection (running across several servers) 
 containing 5 billion documents. Among others, each document has a value for 
 field no_dlng_doc_ind_sto (a long) and field 
 timestamp_dlng_doc_ind_sto (also a long). Both no_dlng_doc_ind_sto 
 and timestamp_dlng_doc_ind_sto are doc-values, indexed and stored. 
 Like this in schema.xml:
 <dynamicField name="*_dlng_doc_ind_sto" type="dlng" indexed="true" 
 stored="true" required="true" docValues="true"/>
 <fieldType name="dlng" class="solr.TrieLongField" precisionStep="0" 
 positionIncrementGap="0" docValuesFormat="Disk"/>


I make queries like this: no_dlng_doc_ind_sto:(NO) AND 
timestamp_dlng_doc_ind_sto:([TIME_START TO TIME_END])
* The no_dlng_doc_ind_sto:(NO)-part of a typical query will hit 
between 500 and 1000 documents out of the total 5 billion
* The timestamp_dlng_doc_ind_sto:([TIME_START TO TIME_END])-part 
of a typical query will hit between 3-4 billion documents out of the 
total 5 billion


 The question is: how does Solr/Lucene deal with such requests?
I am thinking that using the indices on both no_dlng_doc_ind_sto and 
timestamp_dlng_doc_ind_sto to get two sets of doc-ids and then make 
an intersection of those might not be the most efficient. You are 
making an intersection of two doc-id-sets of size 500-1000 and 3-4 
billion. It might be faster to just use the index for 
no_dlng_doc_ind_sto to get the doc-ids for the 500-1000 documents, 
then for each of those fetch their timestamp_dlng_doc_ind_sto-value 
 (using doc-values) to filter out the ones among the 500-1000 that do 
 not match the timestamp part of the query.
 But what does Solr/Lucene actually do? Is it Solr or Lucene code that 
 makes the decision on what to do? Can you somehow hint to the 
 search engine that you want one or the other method used?


Solr 4.4 (and corresponding Lucene), BTW, if that makes a difference

Regards, Per Steffensen
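The strategy Per is asking about, use only the selective index and then filter the small candidate set against a stored per-document value (what doc-values provide), can be modelled generically (plain Python, not Lucene code):

```python
def filter_by_range(candidate_ids, value_by_doc, lo, hi):
    """Keep only candidates whose stored value falls in [lo, hi].

    candidate_ids: doc ids matched by the selective term query (hundreds)
    value_by_doc:  per-document column of values (the doc-values analogue)

    Only len(candidate_ids) values are read, instead of intersecting with
    the billions of ids the range query alone would match.
    """
    return [d for d in candidate_ids if lo <= value_by_doc[d] <= hi]

timestamps = {1: 100, 2: 250, 3: 900}   # toy doc-values column
hits = filter_by_range([1, 2, 3], timestamps, 200, 1000)  # -> [2, 3]
```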





RE: How does query on few-hits AND many-hits work

2014-05-23 Thread Toke Eskildsen
Per Steffensen [st...@designware.dk] wrote:
 * It IS more efficient to just use the index for the
 no_dlng_doc_ind_sto-part of the request to get doc-ids that match that
 part and then fetch timestamp-doc-values for those doc-ids to filter out
 the docs that does not match the timestamp_dlng_doc_ind_sto-part of
 the query.

Thank you for the follow up. It sounds rather special-case though, with 
requirement of DocValues for the range-field. Do you think this can be 
generalized?

- Toke Eskildsen


Re: How does query on few-hits AND many-hits work

2014-05-23 Thread Yonik Seeley
On Fri, May 23, 2014 at 11:37 AM, Toke Eskildsen t...@statsbiblioteket.dk 
wrote:
 Per Steffensen [st...@designware.dk] wrote:
 * It IS more efficient to just use the index for the
 no_dlng_doc_ind_sto-part of the request to get doc-ids that match that
 part and then fetch timestamp-doc-values for those doc-ids to filter out
 the docs that does not match the timestamp_dlng_doc_ind_sto-part of
 the query.

 Thank you for the follow up. It sounds rather special-case though, with 
 requirement of DocValues for the range-field. Do you think this can be 
 generalized?

Maybe it already is?
http://heliosearch.org/advanced-filter-caching-in-solr/

Something like this:
 fq={!frange cache=false cost=150 v=timestampField l=beginTime u=endTime}


-Yonik
http://heliosearch.org - facet functions, subfacets, off-heap filters & fieldCache


Internals about Too many values for UnInvertedField faceting on field xxx

2014-05-23 Thread 张月祥
Could anybody tell us some internals about the error “Too many values for
UnInvertedField faceting on field xxx”?

 

We have two solr servers.

 

Solr A:

128 GB RAM, 60M docs, 2600 different terms in field “code”; every term of
field “code” has fixed length 6.
The sum count of tokens of field “code” is 9 billion.
The total space used by field “code” is 50 billion.

 

 

Solr B:

128 GB RAM, 140M docs, 1600 different terms in field “code”; every term of
field “code” has fixed length 6.
The sum count of tokens of field “code” is 18 billion.
The total space of field “code” is 90 billion.

 

 

When we do the facet query
“q=*:*&wt=xml&indent=true&facet=true&facet.field=code”,

Solr B is OK, BUT Solr A meets an exception with the message “Too many values
for UnInvertedField faceting on field code”.

 

Now we think the limitation of UnInvertedField is related to the number of
different terms in one field.

Could anybody tell us some internals about this problem? We don't want to use
facet.method=enum because it's too slow.

 

Thanks!



RE: Internals about Too many values for UnInvertedField faceting on field xxx

2014-05-23 Thread Toke Eskildsen
张月祥 [zhan...@calis.edu.cn] wrote:
 Could anybody tell us some internals about Too many values for
 UnInvertedField faceting on field xxx ?

I must admit I do not fully understand it in detail, but it is a known problem 
with Field Cache (facet.method=fc) faceting. The remedy is to use DocValues, 
which does not have the same limitation. This should also result in lower heap 
usage. You will have to re-index everything though.

We have successfully used DocValues on an index with 400M documents and 300M 
unique values on a single facet field.

- Toke Eskildsen
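Switching the facet field to DocValues, as Toke suggests, is a schema.xml change along these lines (the field name comes from the question; the type and other attributes are assumptions, and a full re-index is required afterwards):

```xml
<field name="code" type="string" indexed="true" stored="true" docValues="true"/>
```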


Solr 4.7.2 ValueSourceParser classCast exception

2014-05-23 Thread Summer Shire
Hi All,

I have my own popularity value source class,
and I let Solr know about it via solrconfig.xml:


<valueSourceParser name="popularity" 
class="mysolr.sources.PopValueSourceParser" />

But then I get the following class cast exception

I have tried to make sure there are no old Solr jar files in the classpath.

Why would this be happening?
 
I even tried to use the <lib> tag to hard-code the Solr and SolrJ jars for 4.7.2.

org.apache.solr.common.SolrException: Error Instantiating ValueSourceParser, 
mysolr.sources.PopValueSourceParser failed to instantiate 
org.apache.solr.search.ValueSourceParser
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:844)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:630)
at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:562)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:597)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:258)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:250)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)
Caused by: org.apache.solr.common.SolrException: Error Instantiating 
ValueSourceParser, mysolr.sources.PopValueSourceParser failed to instantiate 
org.apache.solr.search.ValueSourceParser
at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:552)
at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:587)
at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2191)
at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2185)
at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2218)
at org.apache.solr.core.SolrCore.initValueSourceParsers(SolrCore.java:2130)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:765)
... 13 more
Caused by: java.lang.ClassCastException: class 
mysolr.sources.PopValueSourceParser
at java.lang.Class.asSubclass(Class.java:3018)
at 
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:454)
at 
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:401)
at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:531)
... 19 more
MySolr[46778:5844 0] 2014/05/22 15:47:28 717.16 MB/4.09 GB ERROR 
org.apache.solr.core.CoreContainer- 
null:org.apache.solr.common.SolrException: Unable to create core: core1
at org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:989)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:606)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:258)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:250)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)


Thanks,
Summer


Re: Import data from Mysql concat issues

2014-05-23 Thread Erick Erickson
A couple of possibilities:

1) The data in Solr is fine. However, your browser is getting the
proper characters back but is not set up to handle the proper
character set, so it displays ? characters.

2) Your servlet container is not set up (either inbound or outbound)
to handle the character set you're sending it.


Best,
Erick

On Fri, May 23, 2014 at 3:18 AM, anarchos78
rigasathanasio...@hotmail.com wrote:
 Hi,

 I'm trying to index data from MySQL. The indexing is successful. Then I
 tried to use the MySQL CONCAT function (in data-config.xml) in order to
 concatenate a custom string with a field, like this: CONCAT('(',
 CAST('ΤΜΗΜΑ' AS CHAR CHARACTER SET utf8), ' ', apofasi_tmima, ')'). The
 custom string ('ΤΜΗΜΑ') is in Greek. When I try to query the field, Solr
 returns ? instead of ΤΜΗΜΑ. I have also used this: CONCAT('(',
 'ΤΜΗΜΑ', ' ', apofasi_tmima, ')'), with no success.
 The data-config.xml file is UTF-8 encoded, and at the beginning there is
 the <?xml version="1.0" encoding="UTF-8"?> XML directive. I have also
 tried to set "characterEncoding=utf8" in the dataSource URL, but then indexing
 fails.
 What am I missing here? Is there any workaround for this?
 Below is a snippet from data-config.xml:

 <?xml version="1.0" encoding="UTF-8"?>
 <dataConfig>
   <dataSource type="JdbcDataSource" autoCommit="true" batchSize="-1"
     convertType="false" driver="com.mysql.jdbc.Driver"
     url="jdbc:mysql://127.0.0.1:3306/apofaseis?zeroDateTimeBehavior=convertToNull"
     user="root" password="" name="db" />
   <dataSource name="fieldReader" type="FieldStreamDataSource" />
   <document>
     <entity name="apofaseis_2000"
       dataSource="db"
       transformer="HTMLStripTransformer"
       query="select id, CONCAT_WS('', CONCAT(apofasi_number, '/', apofasi_date,
         ' ', (CASE apofasi_tmima WHEN NULL THEN '' WHEN '' THEN '' ELSE CONCAT('(',
         CAST('ΤΜΗΜΑ' AS CHAR CHARACTER SET utf8), ' ', apofasi_tmima, ')') END))) AS
         grid_title, CAST(CONCAT_WS('_',id,model) AS CHAR) AS solr_id,
         apofasi_number, apofasi_date, apofasi_tmima, CONCAT(IFNULL(apofasi_thema,
         ''), ' ', IFNULL(apofasi_description, ''), ' ', apofasi_body) AS content,
         type, model, url, search_tag, last_modified, CONCAT_WS('',
         CONCAT(apofasi_number, '/', apofasi_date, ' ', (CASE apofasi_tmima WHEN NULL
         THEN '' WHEN '' THEN '' ELSE CONCAT('(', CAST('ΤΜΗΜΑ' AS CHAR CHARACTER SET
         utf8), ' ', apofasi_tmima, ')') END))) AS title from apofaseis_2000 where
         type = 'text'"
 ...
 ...


 Regards,
 anarchos78




 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Import-data-from-Mysql-concat-issues-tp4137814.html
 Sent from the Solr - User mailing list archive at Nabble.com.
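
A quick way to test Erick's first hypothesis is to reproduce the mojibake locally: the bytes Solr stores can be perfectly valid UTF-8 while the client decodes them with the wrong charset. A minimal Python sketch (not from the thread; the Greek string is the one used above):

```python
# Valid UTF-8 bytes survive a round trip; decoding the same bytes
# with the wrong charset (here Latin-1) garbles the Greek text.
text = "ΤΜΗΜΑ"
raw = text.encode("utf-8")           # 5 Greek letters -> 10 bytes

assert raw.decode("utf-8") == text   # correct charset: round-trips fine
mojibake = raw.decode("latin-1")     # wrong charset: garbled
assert mojibake != text              # same bytes, gibberish display
```

If the damage instead happens before the data reaches Solr (e.g. on the JDBC connection), adding `useUnicode=true&characterEncoding=UTF-8` to the MySQL JDBC URL is the usual fix; note those are Connector/J connection parameters, not Solr settings.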


Re: java.io.EOFException: seek past EOF

2014-05-23 Thread Erick Erickson
What version of Solr are you using? There were some issues like this
in the 4.1 time-frame.

Best,
Erick

On Fri, May 23, 2014 at 3:39 AM, aarthi aarthiran...@gmail.com wrote:
 Hi
 We are getting the seek past EOF exception in solr. This occurs randomly and
 after a reindex we are able to access data again. After running Check Index,
 we got no corrupt blocks. Kindly throw light on the issue.The following is
 the error log:


 2014-05-21 13:57:29,172 INFO processor.LogUpdateProcessor - [LucidWorksLogs]
 webapp= path=/update
 params={waitSearcher=true&commit=true&wt=javabin&expungeDeletes=false&commit_end_point=true&version=2&softCommit=false&update.chain=lucid-update-chain}
 {commit=} 0 14
 2014-05-21 13:57:56,139 ERROR core.SolrCore - java.io.EOFException: seek
 past EOF:
 MMapIndexInput(path=/xxx/xxx/xxx/LucidWorks/LucidWorksSearch/conf/solr/test_Index/data/index.20140515122858307/_cgx_Lucene41_0.doc)
 at
 org.apache.lucene.store.ByteBufferIndexInput.seek(ByteBufferIndexInput.java:174)
 at
 org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsEnum.reset(Lucene41PostingsReader.java:407)
 at
 org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.docs(Lucene41PostingsReader.java:293)
 at
 org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.docs(BlockTreeTermsReader.java:2188)
 at
 org.apache.lucene.search.FieldCacheImpl$SortedDocValuesCache.createValue(FieldCacheImpl.java:1240)
 at
 org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:212)
 at
 org.apache.lucene.search.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:1167)
 at
 org.apache.lucene.search.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:1147)
 at
 org.apache.lucene.search.FieldComparator$TermOrdValComparator.setNextReader(FieldComparator.java:1056)
 at
 org.apache.lucene.search.grouping.AbstractFirstPassGroupingCollector.setNextReader(AbstractFirstPassGroupingCollector.java:332)
 at
 org.apache.lucene.search.grouping.term.TermFirstPassGroupingCollector.setNextReader(TermFirstPassGroupingCollector.java:89)
 at 
 org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:615)
 at 
 org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:297)
 at 
 org.apache.solr.search.Grouping.searchWithTimeLimiter(Grouping.java:426)
 at org.apache.solr.search.Grouping.execute(Grouping.java:348)
 at
 org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:408)
 at
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
 at
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904)
 at
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
 at
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
 at
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
 at
 com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
 at
 com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
 at
 com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
 at 
 com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
 at
 org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
 at
 org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
 at
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
 at
 org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
 at
 org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
 at
 org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
 at
 org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
 at
 org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
 at
 org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
 at
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
 at
 org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
 at
 org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
 at
 org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
 at org.eclipse.jetty.server.Server.handle(Server.java:351)
 at
 org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)
 at
 

Re: Import data from Mysql concat issues

2014-05-23 Thread anarchos78
I think that it happens at index time. The reason is that when i query for
the specific field solr returns the ? string!



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Import-data-from-Mysql-concat-issues-tp4137814p4137908.html


Re: index a repository of documents(.doc) without using post.jar

2014-05-23 Thread Michael Della Bitta
There's an example of using curl to make a REST call to update a core on
this page:

https://wiki.apache.org/solr/UpdateXmlMessages

If that doesn't help, please let us know what error you're receiving.


Michael Della Bitta

Applications Developer

o: +1 646 532 3062

appinions inc.

“The Science of Influence Marketing”

18 East 41st Street

New York, NY 10017

t: @appinions https://twitter.com/Appinions | g+:
plus.google.com/appinionshttps://plus.google.com/u/0/b/112002776285509593336/112002776285509593336/posts
w: appinions.com http://www.appinions.com/


On Fri, May 23, 2014 at 10:42 AM, benjelloun anass@gmail.com wrote:

 Hello,

 I looked to source code of post.jar, that was very interesting.
 I looked for manifoldcf apache, that was interesting too.
 But what I want to do is index some files using the HTTP REST API. This is
 my request, which doesn't work; maybe this way is the easiest to implement:

 put: localhost:8080/solr/update?commit=true
 <add>
   <doc>
     <field name="title">khalid</field>
     <field name="description">bouchna9</field>
     <field name="date">23/05/2014</field>
   </doc>
 </add>

 I'm using dev http client for test.
 Thanks,
 Anass BENJELLOUN




 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/index-a-repository-of-documents-doc-without-using-post-jar-tp4137797p4137881.html
 Sent from the Solr - User mailing list archive at Nabble.com.
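
As a sketch of what a working request looks like: Solr's XML update handler expects an HTTP POST (not PUT) with a text/xml body, which can be assembled and sent with the Python standard library alone. The host, port and field names below are taken from the message above and may not match your deployment:

```python
import urllib.request
import xml.etree.ElementTree as ET

# Build the <add><doc>...</doc></add> update message programmatically.
add = ET.Element("add")
doc = ET.SubElement(add, "doc")
for name, value in [("title", "khalid"),
                    ("description", "bouchna9"),
                    ("date", "23/05/2014")]:
    field = ET.SubElement(doc, "field", name=name)
    field.text = value

payload = ET.tostring(add, encoding="UTF-8")  # bytes, with XML declaration

# POST it to the update handler (a PUT would be rejected).
req = urllib.request.Request(
    "http://localhost:8080/solr/update?commit=true",
    data=payload,
    headers={"Content-Type": "text/xml; charset=UTF-8"},
    method="POST",
)
# urllib.request.urlopen(req)  # uncomment with a running Solr instance
```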



Re: Solr 4.7.2 ValueSourceParser classCast exception

2014-05-23 Thread Jack Krupansky
Are you sure that you compiled your code with the proper Solr jars so that 
the class signature (extends, implements, and constructors) matches the Solr 
4.7.2 jars? I mean, Java is simply complaining that your class is not a 
valid value source class of the specified type.


-- Jack Krupansky

-Original Message- 
From: Summer Shire

Sent: Friday, May 23, 2014 12:40 PM
To: solr-user@lucene.apache.org
Subject: Solr 4.7.2 ValueSourceParser classCast exception

Hi All,

I have my own popularity value source class
and I let solr know about it via solrconfig.xml


<valueSourceParser name="popularity"
class="mysolr.sources.PopValueSourceParser" />


But then I get the following class cast exception

I have tried to make sure there are no old Solr jar files in the classpath.

Why would this be happening ?

I even tried to use the lib tag to hard code the solr and solrj jars for 
4.7.2


org.apache.solr.common.SolrException: Error Instantiating ValueSourceParser, 
mysolr.sources.PopValueSourceParser failed to instantiate 
org.apache.solr.search.ValueSourceParser

at org.apache.solr.core.SolrCore.init(SolrCore.java:844)
at org.apache.solr.core.SolrCore.init(SolrCore.java:630)
at 
org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:562)

at org.apache.solr.core.CoreContainer.create(CoreContainer.java:597)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:258)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:250)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)

at java.lang.Thread.run(Thread.java:680)
Caused by: org.apache.solr.common.SolrException: Error Instantiating 
ValueSourceParser, mysolr.sources.PopValueSourceParser failed to instantiate 
org.apache.solr.search.ValueSourceParser

at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:552)
at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:587)
at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2191)
at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2185)
at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2218)
at org.apache.solr.core.SolrCore.initValueSourceParsers(SolrCore.java:2130)
at org.apache.solr.core.SolrCore.init(SolrCore.java:765)
... 13 more
Caused by: java.lang.ClassCastException: class 
mysolr.sources.PopValueSourceParser

at java.lang.Class.asSubclass(Class.java:3018)
at 
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:454)
at 
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:401)

at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:531)
... 19 more
MySolr[46778:5844 0] 2014/05/22 15:47:28 717.16 MB/4.09 GB ERROR 
org.apache.solr.core.CoreContainer- 
null:org.apache.solr.common.SolrException: Unable to create core: core1

at org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:989)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:606)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:258)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:250)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)

at java.lang.Thread.run(Thread.java:680)


Thanks,
Summer 
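
Jack's diagnosis can be illustrated in miniature: a subclass check fails whenever the base class your code extends is a different class object from the one the framework checks against, even when the names match — which is what happens when a plugin is compiled against, or loaded alongside, mismatched Solr jars. A hedged Python analogue of Java's `Class.asSubclass` check (all class names here are invented for illustration):

```python
# Two base classes stand in for the ValueSourceParser class as loaded
# from two different Solr jars on the classpath.
class ValueSourceParserFromOldJar:
    pass

class ValueSourceParserFromSolr472:
    pass

# The plugin extends the "old jar" copy of the base class...
class PopValueSourceParser(ValueSourceParserFromOldJar):
    pass

# ...so the framework's subclass check against its own copy fails,
# which in Java surfaces as the ClassCastException from asSubclass().
assert not issubclass(PopValueSourceParser, ValueSourceParserFromSolr472)
assert issubclass(PopValueSourceParser, ValueSourceParserFromOldJar)
```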



Re: Question to send

2014-05-23 Thread Shalin Shekhar Mangar
You'll have better luck asking the folks at OpenNLP. This isn't really a
Solr question.


On Fri, May 23, 2014 at 6:38 PM, rashi gandhi gandhirash...@gmail.comwrote:

 HI,



 I have one running solr core with some data indexed on solr server.

 This core  is designed to provide OpenNLP functionalities for indexing and
 searching.

 So I have kept following binary models at this location:
 *\apache-tomcat-7.0.53\solr\collection1\conf\opennlp*

 · en-sent.bin

 · en-token.bin

 · en-pos-maxent.bin

 · en-ner-person.bin

 · en-ner-location.bin



 *My Problem is*: when I unload the running core and try to delete the conf
 directory from it, I am not allowed to delete the directory; a prompt says
 that *en-sent.bin* and *en-token.bin* are in use.

 If I have unloaded the core, why has the lock on these files not been
 released?

 Is this a known issue with the OpenNLP binaries?

 How can I release the connection between the unloaded core and the conf
 directory (especially the binary models)?



 Please provide me some pointers on this.

 Thanks in Advance




-- 
Regards,
Shalin Shekhar Mangar.


Re: Import data from Mysql concat issues

2014-05-23 Thread Erick Erickson
bq: I think that it happens at index time

How do you know that? If you're looking at the results in a browser
you do _not_ know that. If you're looking at the raw values in, say,
SolrJ then you _might_ know that, there's still the issue of whether
you're sending the docs to Solr and your servlet container is munging
them. In this latter case, you're right the info is indexed that way.

Either way, it's not Solr's problem. Solr works just fine with utf-8.
So if the data is getting into the index with weird characters, it's
your setup. If it's just a browser problem, it's _still_ a problem
with your setup.

FWIW,
Erick

On Fri, May 23, 2014 at 10:36 AM, anarchos78
rigasathanasio...@hotmail.com wrote:
 I think that it happens at index time. The reason is that when i query for
 the specific field solr returns the ? string!



 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Import-data-from-Mysql-concat-issues-tp4137814p4137908.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Re: Import data from Mysql concat issues

2014-05-23 Thread anarchos78
Tomcat setup is fine. I insist that it's Solr's issue. The whole index
consists of Greek (funny characters) and solr returns them normally. The
problem here is that I cannot concatenate Greek characters in
data-config.xml (hard-coded).



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Import-data-from-Mysql-concat-issues-tp4137814p4137939.html


SolrCloud Nodes autoSoftCommit and (temporary) missing documents

2014-05-23 Thread Michael Tracey
Hey all,

I've got a number of nodes (Solr 4.4 Cloud) that I'm balancing with HaProxy for 
queries.  I'm indexing pretty much constantly, and have autoCommit and 
autoSoftCommit on for Near Realtime Searching.  All works nicely, except that 
occasionally the auto-commit cycles are far enough off that one node will 
return a document that another node doesn't.  I don't want to have to add 
something like this: timestamp:[* TO NOW-30MINUTE] to every query to make sure 
that all the nodes have the record.  Ideas? autoSoftCommit more often?

<autoCommit>
   <maxDocs>10</maxDocs>
   <maxTime>720</maxTime>
   <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
   <maxTime>3</maxTime>
   <maxDocs>5000</maxDocs>
</autoSoftCommit>

Thanks,

M.
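
One alternative to appending the date-math clause to every query is to compute the cutoff client-side and send it as a filter query, which Solr can cache. A sketch with the Python standard library (the field name `timestamp` is taken from the message above; rounding to the minute is an assumption, made so repeated requests reuse the same cached filter):

```python
from datetime import datetime, timedelta, timezone

def cutoff_fq(field: str, minutes: int) -> str:
    """Build a Solr range filter equivalent to field:[* TO NOW-<minutes>MINUTES],
    rounded down to the minute so the filter cache can reuse it."""
    cutoff = datetime.now(timezone.utc) - timedelta(minutes=minutes)
    cutoff = cutoff.replace(second=0, microsecond=0)
    return "{}:[* TO {}]".format(field, cutoff.strftime("%Y-%m-%dT%H:%M:%SZ"))

fq = cutoff_fq("timestamp", 30)
```

Note this only masks the symptom; soft-committing more often, as the message itself suggests, attacks the commit skew directly.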


fw: (Issue) How improve solr facet performance

2014-05-23 Thread Alice.H.Yang (mis.cnsh04.Newegg) 41493
Hi, Solr Developer

  Thanks very much for your timely reply.

1.  I'm sorry, I have made a mistake, the total number of documents is 32 
Million, not 320 Million.
2.  The system memory is ample for the Solr index: the OS has 256G in total,
and I set the Solr Tomcat HEAPSIZE="-Xms25G -Xmx100G"

-How many fields are you faceting on?

Reply:  9 fields I facet on.

- How many unique values does your facet fields have (approximately)?

Reply:  3 facet fields have about one hundred unique values each; the other 6
facet fields have between 3 and 15 unique values.


- What is the content of your facets (Strings, numbers?)

Reply:  9 fields are all numbers.

- Which facet.method do you use?

Reply:  Used the default facet.method=fc

And we tested this scenario: if a facet field has few unique values, adding
facet.method=enum improves performance a little.

- What is the response time with faceting and a few thousand hits?

Reply:   <result name="response" numFound="2925" start="0">
   QTime is <int name="QTime">6</int>


Best Regards,
Alice Yang
+86-021-51530666*41493
Floor 19,KaiKai Plaza,888,Wanhandu Rd,Shanghai(200042)

-Original Message-
From: Toke Eskildsen [mailto:t...@statsbiblioteket.dk]
Sent: Friday, May 23, 2014 8:08 PM
To: d...@lucene.apache.org
Subject: Re: (Issue) How improve solr facet performance

On Fri, 2014-05-23 at 11:45 +0200, Alice.H.Yang (mis.cnsh04.Newegg)
41493 wrote:
We are blocked by solr facet performance when query hits many 
 documents. (about 10,000,000)

[320M documents, immediate response for plain search with 1M hits]

 But when we add several facet.field parameters, QTime increases to
 220ms or more.

It is not clear whether your observation of increased response time is due to 
many hits or faceting in itself.

- How many fields are you faceting on?
- How many unique values does your facet fields have (approximately)?
- What is the content of your facets (Strings, numbers?)
- Which facet.method do you use?
- What is the response time with faceting and a few thousand hits?

 Do you have some advice on how improve the facet performance when hit 
 many documents.

That depends on whether your bottleneck is the hit count itself, the number of
unique facet values, or a third factor such as I/O.


- Toke Eskildsen, State and University Library, Denmark



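
For concreteness, the parameters discussed in this thread (repeated `facet.field` entries, `facet.method=enum` for low-cardinality fields) assemble into a request like the following sketch; the field names, host, port and core name are placeholders, not values from the thread:

```python
from urllib.parse import urlencode

# rows=0: only the facet counts are wanted, not the documents themselves.
params = [
    ("q", "*:*"),
    ("rows", "0"),
    ("facet", "true"),
    ("facet.method", "enum"),  # often helps when a field has few unique values
]
# facet.field is repeated once per field to facet on.
for field in ["brand_id", "category_id", "price_band"]:
    params.append(("facet.field", field))

query_string = urlencode(params)
url = "http://localhost:8983/solr/collection1/select?" + query_string
```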



Re: java.io.EOFException: seek past EOF

2014-05-23 Thread aarthi
We are using Solr version 4.4



--
View this message in context: 
http://lucene.472066.n3.nabble.com/java-io-EOFException-seek-past-EOF-tp4137817p4137959.html
Sent from the Solr - User mailing list archive at Nabble.com.