RE: Any option to NOT return stack trace in Solr response?

2016-07-22 Thread Koorosh Vakhshoori
Hi Alex,
  Thanks for confirming my finding.

  When it comes to Solr interfacing with a client, I agree completely. However,
I was hoping to limit the noise at the Solr level and not have to add extra
code to filter out the exceptions. Just wondering: wouldn't it be a cleaner
RESTful interface if, instead of reporting the stack trace in the response,
Solr returned an error code and a basic message pointing back to the Solr log
for details such as the stack trace? I am curious: what use case would require
the stack trace in the response?
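
To make the proposal concrete, here is a minimal sketch of the idea in plain Java. It is not Solr code: the class SafeErrorResponse, the fixed 500 code, and the "incident" identifier are all hypothetical names chosen for illustration. The full exception goes to the server log, and the caller only sees a code and a short message that points back to that log.

```java
import java.util.UUID;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper, not part of Solr: logs the full exception server-side
// and hands the caller only a status code and a short message.
final class SafeErrorResponse {
    private static final Logger LOG = Logger.getLogger("SafeErrorResponse");

    final int code;
    final String message;

    private SafeErrorResponse(int code, String message) {
        this.code = code;
        this.message = message;
    }

    static SafeErrorResponse from(Throwable error) {
        // An identifier that lets an operator find the matching log entry.
        String incidentId = UUID.randomUUID().toString();
        // The stack trace is written to the server log only.
        LOG.log(Level.SEVERE, "Request failed, incident " + incidentId, error);
        String summary = error.getClass().getName()
                + "; see server log, incident " + incidentId;
        return new SafeErrorResponse(500, summary);
    }

    public static void main(String[] args) {
        SafeErrorResponse rsp = from(new NullPointerException());
        System.out.println(rsp.code + " " + rsp.message);
    }
}
```

The message carries the exception class name only, which matches the "just return java.lang.NullPointerException" option from the original question.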

  If there is interest, I could open a JIRA issue and come up with a patch.

Regards,

Koorosh

-Original Message-
From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] 
Sent: Thursday, July 21, 2016 6:54 PM
To: solr-user 
Subject: Re: Any option to NOT return stack trace in Solr response?

I don't think there is a flag.

But the bigger question is whether you are exposing Solr directly to the 
client? You should not be. You should have a middleware client that talks to 
Solr and then generates web UI or whatever.

If you give untrusted access to Solr, there are too many things that can be 
done, starting from deleting the whole index.

It might be possible to have a smart proxy and expose Solr only through
heavily filtered valid URLs, but then you would also need to scrub the
response.

That's all I can think of short of writing and registering your own
response handler (probably not that hard).
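
The response scrubbing that such a proxy would need could look roughly like the sketch below. It is only an illustration: the TraceScrubber class is hypothetical, and it assumes the stack trace arrives in an XML element of the form <str name="trace">...</str>, which should be checked against the actual responses before relying on it.

```java
import java.util.regex.Pattern;

// Hypothetical proxy-side helper: removes the trace entry from a Solr XML
// error response before it is forwarded to an untrusted client.
final class TraceScrubber {
    // Assumption: the trace is carried in <str name="trace">...</str>.
    private static final Pattern TRACE =
            Pattern.compile("<str name=\"trace\">.*?</str>", Pattern.DOTALL);

    static String scrub(String solrXml) {
        // DOTALL lets the pattern span the line breaks inside the trace.
        return TRACE.matcher(solrXml).replaceAll("");
    }

    public static void main(String[] args) {
        String body = "<lst name=\"error\"><str name=\"trace\">java.lang.NullPointerException\n"
                + "\tat org.apache.solr.handler.component.SearchHandler.handleRequestBody"
                + "(SearchHandler.java:322)</str><int name=\"code\">500</int></lst>";
        System.out.println(scrub(body));
    }
}
```

A regular expression is enough for a sketch; a proxy that must handle arbitrary responses would be safer with a real XML or JSON parser.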

Regards,
Alex.



Any option to NOT return stack trace in Solr response?

2016-07-21 Thread Koorosh Vakhshoori
Hi all,
  Got a Solr 5.2.1 installation. I am getting the following error response when
calling the TERMS component. The error itself is not the point; I know what is
going on in this instance. However, to address security concerns, I am trying
to have Solr truncate the stack trace in the response. Of course, I would still
want Solr to log the error in its log file. What I was wondering is whether
there is a flag or option I can set in solrconfig.xml, globally or under TERMS,
to omit the trace or just return 'java.lang.NullPointerException'. I have
looked at the source code and don't see anything relevant; however, I may have
missed something. I would appreciate any suggestions and pointers.



<response>
<lst name="responseHeader">
  <int name="status">500</int>
  <int name="QTime">5</int>
</lst>
<lst name="error">
  <str name="trace">java.lang.NullPointerException
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:322)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:2067)
    at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:654)
    at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:450)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:227)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:196)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.filters.CorsFilter.handleNonCORS(CorsFilter.java:439)
    at org.apache.catalina.filters.CorsFilter.doFilter(CorsFilter.java:178)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:219)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:106)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:136)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:79)
    at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:610)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:88)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:526)
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1078)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:655)
    at org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:222)
    at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1566)
    at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1523)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.lang.Thread.run(Thread.java:745)
  </str>
  <int name="code">500</int>
</lst>
</response>



Regards,

Koorosh


RE: How for distributed search only log collective search response

2015-12-18 Thread Koorosh Vakhshoori
It turns out there is a better way to do this. It does not require a code
change in Solr if you are using log4j; however, you need to migrate to the
log4j.xml file format. The solution is to use the filter feature. Here is what
my console appender looks like with the filter:











Regards,

Koorosh




How for distributed search only log collective search response

2015-12-14 Thread Koorosh Vakhshoori
  In my use case, I have a number of shards, and a query runs as a
distributed search. I am not using Solr Cloud; I have just a Solr server.
When the search runs, I see one log entry for each shard query as well as one
for the final collective search response. As a result, I end up with a very
noisy log. I don't care about the individual shard queries, just the aggregate
result. Is there a way to configure Solr so that it logs only the final
collective response? I believe this use case also applies to Solr Cloud.

  Looking at the Solr code, in class SolrCore, I see that the following lines
perform the logging:

if (rsp.getToLog().size() > 0) {
  if (requestLog.isInfoEnabled()) {
    requestLog.info(rsp.getToLogAsString(logid));
  }

  I was thinking of adding a flag that filters out the per-shard log entries by
looking at the 'params' and, if 'isShard=true' is present, not logging the entry.
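
As a rough illustration of that check, the sketch below uses a java.util.logging Filter as a stand-in; it is not Solr's logging code and not a log4j filter. The class name ShardRequestFilter and the two sample log messages are made up for the example. The only part taken from the idea above is the test for 'isShard=true' in the logged params.

```java
import java.util.logging.Filter;
import java.util.logging.Level;
import java.util.logging.LogRecord;

// Stand-in sketch using java.util.logging rather than Solr's own logging
// setup: drops request-log records that belong to per-shard sub-requests.
final class ShardRequestFilter implements Filter {
    @Override
    public boolean isLoggable(LogRecord record) {
        String message = record.getMessage();
        // Keep everything except entries whose params carry isShard=true.
        return message == null || !message.contains("isShard=true");
    }

    public static void main(String[] args) {
        Filter filter = new ShardRequestFilter();
        // Made-up sample messages, only shaped like request log entries.
        LogRecord shard = new LogRecord(Level.INFO,
                "path=/select params={q=foo&isShard=true} status=0 QTime=3");
        LogRecord merged = new LogRecord(Level.INFO,
                "path=/select params={q=foo&shards=host1,host2} status=0 QTime=9");
        System.out.println("shard entry logged: " + filter.isLoggable(shard));
        System.out.println("merged entry logged: " + filter.isLoggable(merged));
    }
}
```

Matching on the rendered message keeps the sketch independent of how the params object is exposed; a patch inside SolrCore could test the request params directly instead.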

  Any suggestions or comments? Is this something people would be interested in?

Regards,

Koorosh



Custom JUnit tests based on SolrTestCaseJ4 fails intermittently.

2012-07-18 Thread Koorosh Vakhshoori
Hi,
  I am trying out the Solr ALPHA release against some custom code and JUnit
tests I have written. I am seeing my custom JUnit tests fail once in a while.
The tests are modeled on Solr's own JUnit test code and extend
SolrTestCaseJ4. My guess is that the randomized testing is running into some
issue here, but I am not sure what the source of the problem is. I noticed that
the value of 'codec' is null for the failed cases, but I am setting the
luceneMatchVersion value in solrconfig.xml as below:

  <luceneMatchVersion>${tests.luceneMatchVersion:LUCENE_CURRENT}</luceneMatchVersion>
 
  I am including the test outputs for both scenarios here.
  
  Any help or pointer appreciated.
  
  Thanks,
  
  Koorosh
  

Here is the output of the JUnit test, which fails when run from Eclipse:
  
NOTE: test params are: codec=null, sim=null, locale=null, timezone=(null)
NOTE: Windows 7 6.1 amd64/Sun Microsystems Inc. 1.6.0_21 (64-bit)/cpus=4,threads=1,free=59414480,total=63242240
NOTE: All tests run in this JVM: [TestDocsHandler]
Jul 18, 2012 3:55:25 PM com.carrotsearch.randomizedtesting.RandomizedRunner runSuite
SEVERE: Panic: RunListener hook shouldn't throw exceptions.
java.lang.NullPointerException
    at org.apache.lucene.util.RunListenerPrintReproduceInfo.reportAdditionalFailureInfo(RunListenerPrintReproduceInfo.java:159)
    at org.apache.lucene.util.RunListenerPrintReproduceInfo.testRunFinished(RunListenerPrintReproduceInfo.java:104)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:634)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)

Here is the output for the same test where it is successful:

24 T11 oas.SolrTestCaseJ4.initCore initCore
Creating dataDir:
C:\Users\xuser\AppData\Local\Temp\solrtest-TestDocsHandler-1342651924084
43 T11 oasc.SolrResourceLoader.locateSolrHome JNDI not configured for solr
(NoInitialContextEx)
43 T11 oasc.SolrResourceLoader.locateSolrHome using system property
solr.solr.home: solr-gold/solr-extraction
45 T11 oasc.SolrResourceLoader. new SolrResourceLoader for deduced
Solr Home: 'solr-gold/solr-extraction\'
284 T11 oasc.SolrConfig. Using Lucene MatchVersion: LUCENE_40
429 T11 oasc.SolrConfig. Loaded SolrConfig: solrconfig-dow.xml
434 T11 oass.IndexSchema.readSchema Reading Solr Schema
443 T11 oass.IndexSchema.readSchema Schema name=SolvNet Common core
522 T11 oass.IndexSchema.readSchema default search field in schema is
indexed_content
524 T11 oass.IndexSchema.readSchema query parser default operator is AND
525 T11 oass.IndexSchema.readSchema unique key field: id
616 T11 oasc.SolrResourceLoader.locateSolrHome JNDI not configured for solr
(NoInitialContextEx)
617 T11 oasc.SolrResourceLoader.locateSolrHome using system property
solr.solr.home: solr-gold/solr-extraction
617 T11 oasc.SolrResourceLoader. new SolrResourceLoader for directory:
'solr-gold/solr-extraction\'
618 T11 oasc.CoreContainer. New CoreContainer 994682772
642 T11 oasc.SolrCore. [collection1] Opening new SolrCore at
solr-gold/solr-extraction\,
dataDir=C:\Users\koo\AppData\Local\Temp\solrtest-TestDocsHandler-1342651924084\
642 T11 oasc.SolrCore. JMX monitoring not detected for core:
collection1
648 T11 oasc.SolrCore.getNewIndexDir WARNING New index directory detected:
old=null
new=C:\Users\koo\AppData\Local\Temp\solrtest-TestDocsHandler-1342651924084\index/
648 T11 oasc.SolrCore.initIndex WARNING [collection1] Solr index directory
'C:\Users\koo\AppData\Local\Temp\solrtest-TestDocsHandler-1342651924084\index'
doesn't exist. Creating new index...
742 T11 oasc.SolrDeletionPolicy.onCommit SolrDeletionPolicy.onCommit:
commits:num=1

commit{dir=MockDirWrapper(org.apache.lucene.store.RAMDirectory@44023756
lockFactory=org.apache.lucene.store.NativeFSLockFactory@21ed5459),segFN=segments_1,generation=1,filenames=[segments_1]
743 T11 oasc.SolrDeletionPolicy.updateCommits newest commit = 1
871 T11 oasc.RequestHandlers.initHandlersFromConfig created /update/javabin:
solr.BinaryUpdateRequestHandler
875 T11 oasc.RequestHandlers.initHandlersFromConfig created standard:
solr.StandardRequestHandler
878 T11 oasc.RequestHandlers.initHandlersFromConfig created /update:
solr.XmlUpdateRequestHandler
878 T11 oasc.RequestHandlers.initHandlersFromConfig created /admin/:
org.apache.solr.handler.admin.AdminHandlers
886 T11 oasc.RequestHandlers.initHandlersFromConfig created /update/extract:
com.synopsys.ies.solr.backend.handler.extraction.SolvNetExtractingRequestHandler
891 T11 oasc.RequestHandlers.initHandlersFromConfig WARNING Multiple
requestHandler registered to the same name: standard ignoring:
org.apache.solr.handler.StandardRequestHandler
892 T11 oasc.RequestHandlers.initHandlersFromConfig created standard:
solr.SearchHandler
892 T11 oasc.RequestHandlers.initHandlersFromConfig created employee:
solr.SearchHandler
892 T11 oasc.RequestHandlers.initHandlersFromConf

Solr 4.0 ALPHA: AbstractSolrTestCase depending on LuceneTestCase

2012-07-17 Thread Koorosh Vakhshoori
Hi,
  I have been developing extensions to the Solr code using the 4.0 trunk. For
JUnit testing I am extending AbstractSolrTestCase, which in the ALPHA release
is located in the JAR apache-solr-test-framework-4.0.0-ALPHA.jar. However,
this class extends LuceneTestCase, which comes from the JAR
lucene-test-framework-4.0-SNAPSHOT.jar. In the ALPHA release the latter JAR
is not shipped, or I can't find it. My question is: which class should I use
for testing customizations/extensions to the Solr/Lucene code? Is there a
better way of doing this without building
lucene-test-framework-4.0-SNAPSHOT.jar from the source code?

Thanks,

Koorosh


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-4-0-ALPHA-AbstractSolrTestCase-depending-on-LuceneTestCase-tp3995639.html
Sent from the Solr - User mailing list archive at Nabble.com.


ContentStreamUpdateRequest method addFile in 4.0 release.

2012-06-07 Thread Koorosh Vakhshoori
In the latest 4.0 release, the addFile() method has a new argument, 'contentType':

addFile(File file, String contentType)

In the context of Solr Cell, how should the addFile() method be called?
Specifically, I refer to the Wiki example:

ContentStreamUpdateRequest up = new ContentStreamUpdateRequest("/update/extract");
up.addFile(new File("mailing_lists.pdf"));
up.setParam("literal.id", "mailing_lists.pdf");
up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
result = server.request(up);
assertNotNull("Couldn't upload mailing_lists.pdf", result);
rsp = server.query( new SolrQuery( "*:*") );
Assert.assertEquals( 1, rsp.getResults().getNumFound() );

given at URL: http://wiki.apache.org/solr/ExtractingRequestHandler

Since Solr Cell is calling Tika under the hood, isn't the file's content type
already identified by Tika? Looking at the code, it seems passing null would do
the job; is that correct? Also, for Solr Cell, is ContentStreamUpdateRequest
the right class to use, or is there a different class that is more appropriate
here?

Thanks
 

--
View this message in context: 
http://lucene.472066.n3.nabble.com/ContentStreamUpdateRequest-method-addFile-in-4-0-release-tp3988344.html


Re: Display of highlighted search result should start with the beginning of the sentence that contains the search string.

2012-03-12 Thread Koorosh Vakhshoori
Hi Koji,
  I am Shyam's coworker. After looking into this issue, I believe the
problem of the chopped word has to do with the 'margin' field of the
org.apache.lucene.search.vectorhighlight.SimpleFragListBuilder class. It is
set to 6 by default. My understanding is that a margin value greater than zero
results in a truncated word when the highlighted term is too close to the
beginning of a document. I was able to reset the 'margin' field by creating my
own version of org.apache.solr.highlight.SimpleFragListBuilder and passing
zero for 'margin' when calling Lucene's SimpleFragListBuilder constructor. My
testing shows the problem has been fixed. Do you concur?
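
To show how a nonzero margin can chop a word, here is a simplified model in plain Java. It is an illustration under assumptions, not the Lucene SimpleFragListBuilder source: the MarginDemo class and its methods are hypothetical, and the model only assumes that a fragment starts 'margin' characters before the matched term, clamped to the start of the text.

```java
// Simplified model of how a fragment start could be derived from a margin;
// an illustration only, not the Lucene SimpleFragListBuilder source.
final class MarginDemo {
    // Start the fragment `margin` characters before the matched term,
    // but never before the beginning of the text.
    static int fragmentStart(int termStart, int margin) {
        return Math.max(0, termStart - margin);
    }

    // Returns the text from the computed fragment start to the end.
    static String fragment(String text, String term, int margin) {
        int termStart = text.indexOf(term);
        return text.substring(fragmentStart(termStart, margin));
    }

    public static void main(String[] args) {
        String text = "Highlighting demo text";
        // With a margin of 6 the start lands inside the word "Highlighting".
        System.out.println(fragment(text, "demo", 6));
        // With a margin of 0 the fragment starts at the term itself.
        System.out.println(fragment(text, "demo", 0));
    }
}
```

In this model the fragment start is a pure character offset, so whether it falls on a word boundary depends only on the lengths of the words before the term.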

  Now a couple of questions. I am not sure what the purpose of this field is;
could you give the use case for it? Also, could it be exposed as a parameter in
Solr so that it could be set to some other value?

Thanks,

Koorosh


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Display-of-highlighted-search-result-should-start-with-the-beginning-of-the-sentence-that-contains-t-tp3722912p3820516.html


Re: Indexing leave behind write.lock file.

2012-01-31 Thread Koorosh Vakhshoori
Here is how I got SolrJ to delete the write.lock file. I switched to the
CoreContainer's remove() method. So the new code is:

...
SolrCore curCore = container.remove("core1");
curCore.close();

Here is my understanding of why it works. Based on the Solr source code, the
issue had to do with the core's reference count not ending up at zero when
the close() method is called. The getCore() method increments the reference
count, while remove() doesn't. The close() method decrements the count first
and unlocks the core, i.e. removes write.lock, if and only if the count is
then zero. So after getCore() the count is still above zero when close()
returns, and the lock file stays behind.
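
The reference counting described above can be modeled with a toy example. Everything in it is hypothetical: the RefCountDemo and Core classes only mimic the counting behavior and are not Solr's SolrCore or CoreContainer. The assumption is that a core starts with one reference held by its container.

```java
// Toy model of the reference counting described above; the class and method
// names are hypothetical and do not mirror Solr's SolrCore or CoreContainer.
final class RefCountDemo {
    static final class Core {
        private int refCount = 1;       // assumed: the container's own reference
        private boolean locked = true;  // stands in for the write.lock file

        void open() { refCount++; }

        void close() {
            refCount--;
            if (refCount == 0) {
                locked = false;         // the lock is released only at zero
            }
        }

        boolean isLocked() { return locked; }
    }

    // Mimics getCore(): hands out the core and increments the count.
    static Core getCore(Core core) { core.open(); return core; }

    // Mimics remove(): hands out the core without incrementing the count.
    static Core remove(Core core) { return core; }

    public static void main(String[] args) {
        Core viaGet = getCore(new Core());
        viaGet.close();
        System.out.println("after getCore + close, locked: " + viaGet.isLocked());

        Core viaRemove = remove(new Core());
        viaRemove.close();
        System.out.println("after remove + close, locked: " + viaRemove.isLocked());
    }
}
```

In this model a second close() after getCore() would also release the lock, which is consistent with the count simply being one too high after a single close().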

Regards,

Koorosh 

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Indexing-leave-behind-write-lock-file-tp3701915p3705554.html


Indexing leave behind write.lock file.

2012-01-30 Thread Koorosh Vakhshoori
Hi,
 I am using SolrJ to reindex a core in a multiCore setup. The general
flow of my program is as follows (pseudo code):

String solrHome = "/opt/solr/home";
File solrXml = new File( solrHome, "solr.xml" );
CoreContainer container = new CoreContainer();
container.load(solrHome, solrXml);
SolrServer solr = new EmbeddedSolrServer(container, "core1");
// Clear the existing index before reindexing.
solr.deleteByQuery("*:*");
SolrInputDocument doc1 = new SolrInputDocument();
doc1.addField( "id", "id1", 1.0f );
doc1.addField( "name", "doc1", 1.0f );
Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
docs.add( doc1 );
// Send the documents to the core before committing.
solr.add( docs );
solr.commit();
SolrCore curCore = container.getCore("core1");
curCore.close();

I thought for sure that by calling close() I would also be releasing all
associated resources, including the lock on the core; that is, I would be
getting rid of the write.lock file.

I am using Solr 4.0 code from the development trunk, which is about a month old.

Any suggestions here are appreciated.

Regards,

Koorosh


Question on XPATH use in Solr Cell.

2011-06-15 Thread Koorosh Vakhshoori
I am new to both Solr and Cell, so sorry if I am misusing some of the
terminology. The problem I am trying to solve is to index a PDF document
using Solr Cell while excluding part of it via XPATH. I am using Solr
release 3.1. When researching the user list, I came across one entry on this
topic titled 'XPath query support in Solr Cell', which clarified one issue, but
I am still having problems getting what I want.

Here is what I have done so far:

First, I started by executing the following 'CURL' command to see what I would
get:

curl "http://localhost:8983/solr/docs/update/extract?literal.id=123&xpath=/xhtml:html/xhtml:body/xhtml:div/descendant:node()&extractOnly=true" -F "file=@/docs/test.pdf"

This worked fine. Next, I tried getting the first DIV element by modifying the
XPATH query as follows:

curl "http://localhost:8983/solr/docs/update/extract?literal.id=123&xpath=/xhtml:html/xhtml:body/xhtml:div\[1\]/descendant:node()&extractOnly=true" -F "file=@/docs/test.pdf"

Note that I am escaping the '[]'; I even tried using their encoded values, %5B
and %5D. It ran, but it did not match anything. Here is what I got:



0627
fileVersion A-2007.
122009-08-12T17:07:27ZTest title.FrameMaker 7.1187
2009-08-12T17:07:27ZTest Documentapplication/octet-streamWed Aug 12 10:07:2
7 PDT 20091372769test.pdfAcrobat Di
stiller 7.0.5 (Windows)2007application/pdfTest


On a different track, I explored what could be an XPATH expression for my
purpose. Here I have something that should get me most of the way there:

//xhtml:body/xhtml:div\[not(contains(p,'EXCLUDE TEXT'))\]

I independently validated the XPATH expression at the following URL, as was
suggested in the previously mentioned posting:

http://www.whitebeam.org/library/guide/TechNotes/xpathtestbed.rhtm
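
The same kind of standalone validation can be done locally with the XPath engine that ships with the JDK, as in the sketch below. It rests on several assumptions: the XPathCheck class and the three-DIV sample document are made up, the sample has no namespaces (so the xhtml: prefixes are dropped), and the axis is written in the standard descendant::node() form. It only shows what a full XPath 1.0 engine matches; it says nothing about which expressions Solr Cell itself accepts.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical standalone check using the JDK's XPath 1.0 engine.
final class XPathCheck {
    // Counts the nodes that the expression selects in the given XML text.
    static int countMatches(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression, new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad XPath or XML: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample: three DIVs, one of which contains the marker text.
        String xml = "<html><body>"
                + "<div><p>keep this page</p></div>"
                + "<div><p>EXCLUDE TEXT here</p></div>"
                + "<div><p>keep this too</p></div>"
                + "</body></html>";
        // DIVs whose first P child does not contain the marker text.
        System.out.println(countMatches(xml,
                "//body/div[not(contains(p,'EXCLUDE TEXT'))]"));
        // All descendant nodes of the first DIV only.
        System.out.println(countMatches(xml,
                "/html/body/div[1]/descendant::node()"));
    }
}
```

If both expressions match here but not through Solr Cell, the difference would point at how the expression is parsed on the Solr side rather than at the expression itself.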

Any suggestion and help is greatly appreciated.