Re: Solr query - response status

2016-07-22 Thread Shyam R
Thanks Shawn for your insight!

On Fri, Jul 22, 2016 at 6:32 PM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 7/22/2016 12:41 AM, Shyam R wrote:
> > I see that SOLR returns status value as 0 for successful searches
> > org.apache.solr.core.SolrCore; [users_shadow_shard1_replica1]
> > webapp=/solr path=/user/ping params={} status=0 QTime=0 I do see that
> > the status comes back as 400 whenever the search is invalid (
> > invoking query with parameters that are not available in the target
> > collection ) What are the legitimate values of status and reason for
> > choosing 0?
>
> Solr (Jetty, really) sends back "200" for the HTTP status code when the
> request status is zero.
>
> The reason Solr uses a status of zero internally has its origins in the
> way most operating systems deal with program exit codes.  Almost
> universally, when a program exits with an exit code of 0, it tells the
> operating system that the exit was normal, no errors.  Any positive
> number indicates some kind of error.  The reason this is not reversed is
> simple -- unlike HTTP, which has multiple codes meaning success,
> operating systems must handle many different error codes, but only one
> success code.  So the success code is assigned to the number that's
> inherently different from the rest -- zero.
>
> Internally, Solr doesn't necessarily know that the response is going to
> use HTTP, although that is the most common method.  In the mind of a
> typical open source developer, an exit status of ANY positive number
> means there was an error, including 200.  Once control is handed off to
> Jetty, then the zero success status is translated to the most-used
> success code for HTTP.
>
> Any number could *potentially* be valid for the status in Solr logs, but
> I've only ever seen zero, 40x, and 50x.  The 40x series means there was
> a problem detected in the request, 50x means an error happened inside
> Solr itself after the request was determined to be good.  The ping
> handler will return a 503 status if the health check is put into a
> disabled state.
>
> Thanks,
> Shawn
>
>
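
For reference, a minimal SolrJ sketch of reading that internal status (the URL
and collection name below are placeholders, not taken from this thread):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class StatusCheck {
        public static void main(String[] args) throws Exception {
            // getStatus() returns the internal code described above:
            // 0 on success, 4xx for bad requests, 5xx for server-side errors.
            HttpSolrClient client =
                    new HttpSolrClient("http://localhost:8983/solr/mycollection");
            QueryResponse response = client.query(new SolrQuery("*:*"));
            System.out.println("status=" + response.getStatus());
            client.close();
        }
    }

The HTTP layer reports 200/4xx/5xx separately, as Shawn explains.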


-- 
Ph: 9845704792


Solr query - response status

2016-07-22 Thread Shyam R
All,

I see that SOLR returns status value as 0 for successful searches

org.apache.solr.core.SolrCore; [users_shadow_shard1_replica1] webapp=/solr
path=/user/ping params={} status=0 QTime=0

I do see that the status comes back as 400 whenever the search is invalid
(invoking a query with parameters that are not available in the target
collection).

What are the legitimate values of status, and what is the reason for choosing 0?


Thanks
Shyam
-- 
Ph: 9845704792


Re: Solr not working on new environment

2016-03-29 Thread Shyam R
Hi Jarus,

Have you tried stopping the Solr process and restarting the cluster?
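
For example (a sketch only; the paths assume the 5.5.0 install layout shown in
your log, and "-e cloud" re-launches the two-node SolrCloud example):

    /opt/solr-5.5.0/bin/solr stop -all
    /opt/solr-5.5.0/bin/solr start -e cloud -noprompt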

Thanks
Shyam

On Tue, Mar 29, 2016 at 8:36 PM, Jarus Bosman  wrote:

> Hi,
>
> Introductions first (as I was taught): My name is Jarus Bosman, I am a
> software developer from South Africa, doing development in Java, PHP and
> Delphi. I have been programming for 19 years and find out more every day
> that I don't actually know anything about programming ;).
>
> My problem:
>
> We recently moved our environment to a new server. I've installed 5.5.0 on
> the new environment. When I want to start the server, I get the following:
>
> *Welcome to the SolrCloud example!*
>
> *Starting up 2 Solr nodes for your example SolrCloud cluster.*
>
> *Solr home directory /opt/solr-5.5.0/example/cloud/node1/solr already
> exists.*
> */opt/solr-5.5.0/example/cloud/node2 already exists.*
>
> *Starting up Solr on port 8983 using command:*
> */opt/solr-5.5.0/bin/solr start -cloud -p 8983 -s
> "/opt/solr-5.5.0/example/cloud/node1/solr"*
>
> *Waiting up to 30 seconds to see Solr running on port 8983 [/]  Still not
> seeing Solr listening on 8983 after 30 seconds!*
> *INFO  - 2016-03-29 14:22:14.356; [   ] org.eclipse.jetty.util.log.Log;
> Logging initialized @463ms*
> *INFO  - 2016-03-29 14:22:14.717; [   ] org.eclipse.jetty.server.Server;
> jetty-9.2.13.v20150730*
> *WARN  - 2016-03-29 14:22:14.752; [   ]
> org.eclipse.jetty.server.handler.RequestLogHandler; !RequestLog*
> *INFO  - 2016-03-29 14:22:14.757; [   ]
> org.eclipse.jetty.deploy.providers.ScanningAppProvider; Deployment monitor
> [file:/opt/solr-5.5.0/server/contexts/] at interval 0*
> *INFO  - 2016-03-29 14:22:15.768; [   ]
> org.eclipse.jetty.webapp.StandardDescriptorProcessor; NO JSP Support for
> /solr, did not find org.apache.jasper.servlet.JspServlet*
> *WARN  - 2016-03-29 14:22:15.790; [   ]
> org.eclipse.jetty.security.ConstraintSecurityHandler;
> ServletContext@o.e.j.w.WebAppContext
> @7a583307{/solr,file:/opt/solr-5.5.0/server/solr-webapp/webapp/,STARTING}{/opt/solr-5.5.0/server/solr-webapp/webapp}
> has uncovered http methods for path: /*
> *INFO  - 2016-03-29 14:22:15.809; [   ]
> org.apache.solr.servlet.SolrDispatchFilter; SolrDispatchFilter.init():
> WebAppClassLoader=1287618844@4cbf811c*
> *INFO  - 2016-03-29 14:22:15.848; [   ]
> org.apache.solr.core.SolrResourceLoader; JNDI not configured for solr
> (NoInitialContextEx)*
> *INFO  - 2016-03-29 14:22:15.849; [   ]
> org.apache.solr.core.SolrResourceLoader; using system property
> solr.solr.home: /opt/solr-5.5.0/example/cloud/node1/solr*
> *INFO  - 2016-03-29 14:22:15.850; [   ]
> org.apache.solr.core.SolrResourceLoader; new SolrResourceLoader for
> directory: '/opt/solr-5.5.0/example/cloud/node1/solr'*
> *INFO  - 2016-03-29 14:22:15.851; [   ]
> org.apache.solr.core.SolrResourceLoader; JNDI not configured for solr
> (NoInitialContextEx)*
> *INFO  - 2016-03-29 14:22:15.852; [   ]
> org.apache.solr.core.SolrResourceLoader; using system property
> solr.solr.home: /opt/solr-5.5.0/example/cloud/node1/solr*
> *INFO  - 2016-03-29 14:22:15.880; [   ] org.apache.solr.core.SolrXmlConfig;
> Loading container configuration from
> /opt/solr-5.5.0/example/cloud/node1/solr/solr.xml*
> *INFO  - 2016-03-29 14:22:16.051; [   ]
> org.apache.solr.core.CorePropertiesLocator; Config-defined core root
> directory: /opt/solr-5.5.0/example/cloud/node1/solr*
> *INFO  - 2016-03-29 14:22:16.104; [   ] org.apache.solr.core.CoreContainer;
> New CoreContainer 1211012646*
> *INFO  - 2016-03-29 14:22:16.104; [   ] org.apache.solr.core.CoreContainer;
> Loading cores into CoreContainer
> [instanceDir=/opt/solr-5.5.0/example/cloud/node1/solr]*
> *WARN  - 2016-03-29 14:22:16.109; [   ] org.apache.solr.core.CoreContainer;
> Couldn't add files from /opt/solr-5.5.0/example/cloud/node1/solr/lib to
> classpath: /opt/solr-5.5.0/example/cloud/node1/solr/lib*
> *INFO  - 2016-03-29 14:22:16.133; [   ]
> org.apache.solr.handler.component.HttpShardHandlerFactory; created with
> socketTimeout : 60,connTimeout : 6,maxConnectionsPerHost :
> 20,maxConnections : 1,corePoolSize : 0,maximumPoolSize :
> 2147483647,maxThreadIdleTime : 5,sizeOfQueue : -1,fairnessPolicy :
> false,useRetries : false,*
> *INFO  - 2016-03-29 14:22:16.584; [   ]
> org.apache.solr.update.UpdateShardHandler; Creating UpdateShardHandler HTTP
> client with params: socketTimeout=60=6=true*
> *INFO  - 2016-03-29 14:22:16.590; [   ] org.apache.solr.logging.LogWatcher;
> SLF4J impl is org.slf4j.impl.Log4jLoggerFactory*
> *INFO  - 2016-03-29 14:22:16.592; [   ] org.apache.solr.logging.LogWatcher;
> Registering Log Listener [Log4j (org.slf4j.impl.Log4jLoggerFactory)]*
> *INFO  - 2016-03-29 14:22:16.603; [   ]
> org.apache.solr.cloud.SolrZkServerProps; Reading configuration from:
> /opt/solr-5.5.0/example/cloud/node1/solr/zoo.cfg*
> *INFO  - 2016-03-29 14:22:16.605; [   ] org.apache.solr.cloud.SolrZkServer;
> STARTING EMBEDDED STANDALONE ZOOKEEPER SERVER at port 9983*

Re: schemaless vs schema based core

2016-01-22 Thread Shyam R
I think schema-less mode might allocate double instead of float, and long
instead of int, to guard against overflow, which would increase index size. Is
my assumption valid?

Thanks




On Thu, Jan 21, 2016 at 10:48 PM, Erick Erickson 
wrote:

> I guess it's all about whether schemaless really supports
> 1> all the docs you index.
> 2> all the use-cases for search.
> 3> the assumptions it makes scale to your needs.
>
> If you've established rigorous tests and schemaless does all of the
> above, I'm all for shortening the cycle by using schemaless.
>
> But if it's just being sloppy and "success" is "I managed to index 50
> docs and get some results back by searching", expect to find some
> "interesting" issues down the road.
>
> And finally, if it's "we use schemaless to quickly try things in the
> UI and for the _real_ prod environment we need to be more rigorous
> about the schema", well shortening development time is A Good Thing.
> Part of moving to prod could be taking the schema generated by
> schemaless and tweaking it for instance.
>
> Best,
> Erick
>
> On Thu, Jan 21, 2016 at 8:54 AM, Shawn Heisey  wrote:
> > On 1/21/2016 2:22 AM, Prateek Jain J wrote:
> >> Thanks Erick,
> >>
> >> Yes, I took the same approach as suggested by you. The issue is that some
> >> developers started with the schemaless configuration and now they have
> >> started liking it and avoiding restrictions (including the increased time
> >> to deploy the application in a managed enterprise environment). I was more
> >> concerned about pushing best practices around this in the team, because
> >> allowing anyone to add new attributes will become an overhead in terms of
> >> management, security and maintainability. Regarding your concern about not
> >> storing documents on a separate disk; we are storing them in Solr, but not
> >> as backup copies. One doubt still remains in my mind w.r.t. auto-detection
> >> of types in Solr:
> >>
> >> Is there a performance benefit of using defined types (schema based) vs
> >> undefined types while adding documents? Does "solrj" ship this
> >> meta-information, like the type of attributes, to Solr? The code looks
> >> something like this:
> >>
> >> SolrInputDocument doc = new SolrInputDocument();
> >>   doc.addField("category", "book"); // String
> >>   doc.addField("id", 1234); //Long
> >>   doc.addField("name", "Trying solrj"); //String
> >>
> >> In my opinion, any auto-detector code will have some overhead vs the
> >> other; any thoughts around this?
> >
> > Although the true reality may be more complex, you should consider that
> > everything Solr receives from SolrJ will be text -- as if you had sent
> > the JSON or XML indexing format manually, which has no type information.
> >
> > When you are building a document with SolrInputDocument, SolrJ has no
> > knowledge of the schema in Solr.  It doesn't know whether the target
> > field is numeric, string, date, or something else.
> >
> > Using different object types for input to SolrJ just gives you general
> > Java benefits -- things like detecting certain programming errors at
> > compile time.
> >
> > Thanks,
> > Shawn
> >
>
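
A minimal sketch (hypothetical field names) of the point above -- the typed
Java values mainly buy you compile-time checks on the client side, while the
field type defined in the schema still decides how the value is indexed:

    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", 1234L);        // effectively sent as the text "1234"
    doc.addField("id_str", "1234");   // reaches Solr the same way
    // If the schema maps the field to a numeric type, Solr parses the value
    // at index time regardless of which Java type was used here.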



-- 
Ph: 9845704792


Re: Stable Versions in Solr 4

2015-12-30 Thread Shyam R
I will always look around here for versions / new functionality or fixes /
release notes

https://issues.apache.org/jira/browse/SOLR/?selectedTab=com.atlassian.jira.jira-projects-plugin:changelog-panel

Thanks

On Thu, Dec 31, 2015 at 4:05 AM, Shawn Heisey  wrote:

> On 12/28/2015 5:12 AM, abhi Abhishek wrote:
> > I am trying to determine the stable version of SOLR 4. Is there a blog
> > which we can refer to? I understand we can read through the Release Notes.
> > I am interested in user reviews and challenges seen with various versions
> > of SOLR 4.
>
> Here's some information about Solr version numbers, with X.Y.Z providing
> the legend:  X is the major version number.  Major versions are released
> very infrequently.  Y tracks the minor version number.  Minor releases
> are made quite frequently.  Z is incremented with bugfix releases.  Most
> of the time, the third number in the version is zero.
>
> Every release of Solr that you can download from the official mirror
> network is built from a version control branch that is known as the
> stable branch.  Currently that is branch_5x, at some point in the future
> it will be branch_6x.
>
> The goal of the stable branch is to always be in a state where a viable
> release candidate could be created.  That's why it's called the stable
> branch.  If all of the tests in the included test suite are passing,
> that's a good sign that there are no major problems.  It's no guarantee,
> just a good sign.
>
> All releases have bugs, but unless those bugs are very nasty, they do
> not get fixed until the next minor version.  When the bugs are
> particularly bad, there might be a bugfix release.
>
> It sounds like you're trying to decide which release you should use.
> The answer to that question is usually very easy -- the latest version,
> which is currently 5.4.0.  Right after a new release happens, the best
> choice might be the newest bugfix release of the previous minor version.
>
> The pace of development is very high in Solr.  Each new minor version
> includes new features and enhancements.  The sum total of the
> differences between 4.0 and 4.10 is greater than the difference between
> 4.10 and 5.0.
>
> I would not recommend using a 4.x release at this time.  The 4.x line
> went into maintenance mode ten months ago with the release of 5.0.  The
> community is now focused on 5.x versions.  If you mention a problem with
> a 4.x version now, the first thing you'll be told is that you need to
> upgrade, because unless the bug you're experiencing is a showstopper
> that affects a wide variety of users, it will not be fixed in 4.x.  If
> it is a major bug that affects a large number of users, it will only be
> fixed a version like 4.10.5 -- a bugfix release on the last minor 4.x
> version.
>
> Thanks,
> Shawn
>



-- 
Ph: 9845704792


Re: Replacing a document in Solr5

2015-12-20 Thread Shyam R
+1

On Sun, Dec 20, 2015 at 10:28 PM, Debraj Manna 
wrote:

> Thanks Andrea for the detailed explanation.
> On Dec 19, 2015 1:34 PM, "Andrea Gazzarini"  wrote:
>
> > That has nothing to do with your topic: addField adds a new value to a
> > given field in a SolrInputDocument, while setField replaces any existing
> > value of a given field (regardless of whether that field currently has
> > zero, one or more values).
> >
> > SolrInputDocument document = new SolrInputDocument();
> >
> > document.setField("id", 32872382); // the id field now has one value: 32872382
> >
> > document.addField("author", "B. Meyer"); // the author field has one value. In
> > this case, being the first value, addField() and setField() behave in the same
> > way
> >
> > document.addField("author", "A. Yersu"); // now the author field has two values
> > document.setField("author", "I.UUhash"); // that will replace the existing two
> > values with this value
> >
> >
> > solrClient.add(document); // here, you are sending a document with 1 id and
> > 1 author
> >
> >
> >
> > Those are methods of SolrInputDocument; when you call them, you're
> changing
> > the state of a local transfer object (the SolrInputDocument instance).
> > Before sending that to Solr using solrClient.add(SolrInputDocument) you
> can
> > do whatever you want with that instance (i.e. removing, adding, setting
> > values). The "document" representation that Solr will see is the state of
> > the instance that you pass to solrClient.add(...)
> >
> > Best,
> > Andrea
> >
> >
> > 2015-12-19 8:48 GMT+01:00 Debraj Manna :
> >
> > > Ok. Then what is the difference between addField
> > > <http://github.com/apache/lucene-solr/tree/lucene_solr_5_3_1/solr/solrj/src/java/org/apache/solr/common/SolrInputDocument.java#L150>
> > > and setField
> > > <http://www.solr-start.com/javadoc/solr-lucene/org/apache/solr/common/SolrInputDocument.html#setField-java.lang.String-java.lang.Object-float->
> > > ?
> > >
> > > On Sat, Dec 19, 2015 at 1:04 PM, Andrea Gazzarini <
> a.gazzar...@gmail.com
> > >
> > > wrote:
> > >
> > > > As far as I know, this is how Solr works (i.e. it replaces the whole
> > > > document): how do you replace only a part of a document?
> > > >
> > > > Just send a SolrInputDocument with an existing (i.e. already indexed) id
> > > > and the document (on Solr) will be replaced.
> > > >
> > > > Andrea
> > > >
> > > > 2015-12-19 8:16 GMT+01:00 Debraj Manna :
> > > >
> > > > > Can someone let me know how I can replace a document on each update in
> > > > > Solr 5.2.1 using SolrJ? I don't want to update parts of the document. On
> > > > > doing an update it should replace the entire document.
> > > > >
> > > >
> > >
> >
>
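
A minimal sketch of the full replacement described above (assuming "id" is the
uniqueKey field): re-adding a document with an already-indexed id overwrites
the previous version.

    SolrInputDocument doc = new SolrInputDocument();
    doc.setField("id", "32872382");            // same id as the indexed document
    doc.setField("author", "Updated Author");  // new field values
    solrClient.add(doc);
    solrClient.commit();                       // the old document is replaced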



-- 
Ph: 9845704792


Re: Indexing PDF and MS Office files

2015-04-14 Thread Shyam R
Vijay,

You could try different Excel files with different formats to rule out that
the issue is with the TIKA version being used.

Thanks
Murthy

On Wed, Apr 15, 2015 at 9:35 AM, Terry Rhodes trhodes...@gmail.com wrote:

 Perhaps the PDF is protected and the content can not be extracted?

 I have an unverified suspicion that the Tika shipped with Solr 4.10.2 may
 not support some/all Office 2013 document formats.





 On 4/14/2015 8:18 PM, Jack Krupansky wrote:

 Try doing a manual extraction request directly to Solr (not via SolrJ) and
 use the extractOnly option to see if the content is actually extracted.

 See:
 https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Solr+Cell+using+Apache+Tika

 Also, some PDF files actually have the content as a bitmap image, so no
 text is extracted.


 -- Jack Krupansky
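
For illustration (core name and file path are placeholders, not from the
thread), an extract-only request of this shape returns the Tika output without
indexing anything:

    curl "http://localhost:8983/solr/collection1/update/extract?extractOnly=true" \
         -F "myfile=@/path/to/document.pdf"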

 On Tue, Apr 14, 2015 at 10:57 AM, Vijaya Narayana Reddy Bhoomi Reddy 
 vijaya.bhoomire...@whishworks.com wrote:

  Hi,

 I am trying to index PDF and Microsoft Office files (.doc, .docx, .ppt,
 .pptx, .xls, and .xlsx) into Solr. I am facing the following issues.
 Please let me know what is going wrong with the indexing process.

 I am using Solr 4.10.2 and the default example server configuration that
 comes with the Solr distribution.

 PDF files - indexing as such works fine, but when I query using *:* in the
 Solr Query console, only the metadata information is displayed properly; the
 PDF content field is empty. This is happening for all PDF files I have
 tried. I have tried with some proprietary files, PDF eBooks, etc. Whatever
 the PDF file, the content is not being displayed.

 MS Office files - For some Office files, everything works perfectly and the
 extracted content is visible in the query console. However, for others, I
 see the below error message during the indexing process.

 Exception in thread "main"
 org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:
 org.apache.tika.exception.TikaException: Unexpected RuntimeException from
 org.apache.tika.parser.microsoft.OfficeParser


 I am using SolrJ to index the documents and below is the code snippet
 related to indexing. Please let me know where the issue is occurring.

  static String solrServerURL = "http://localhost:8983/solr";
  static SolrServer solrServer = new HttpSolrServer(solrServerURL);
  static ContentStreamUpdateRequest indexingReq =
          new ContentStreamUpdateRequest("/update/extract");

  indexingReq.addFile(file, fileType);
  indexingReq.setParam("literal.id", literalId);
  indexingReq.setParam("uprefix", "attr_");
  indexingReq.setParam("fmap.content", "content");
  indexingReq.setParam("literal.fileurl", fileURL);
  indexingReq.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
  solrServer.request(indexingReq);

 Thanks & Regards
 Vijay

 --
 The contents of this e-mail are confidential and for the exclusive use of
 the intended recipient. If you receive this e-mail in error please delete
 it from your system immediately and notify us either by e-mail or
 telephone. You should not copy, forward or otherwise disclose the content
 of the e-mail. The views expressed in this communication may not
 necessarily be the view held by WHISHWORKS.





-- 
Ph: 9845704792