Re: java.lang.StackOverflowError if pass long string in q parameter

2020-08-10 Thread Furkan KAMACI
Hi Kumar,

The problem you have here is a StackOverflowError, which is not related to
any character limit on the q parameter: a long chain of OR clauses parses
into a deeply nested boolean query, and that nesting is what overflows the
stack. First of all, consider using pagination to fetch data from Solr.
Secondly, share the configuration settings you start Solr with and how much
index you have, so we can check whether you need tuning.
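
As an illustration, a common workaround is the terms query parser, which
matches many values of one field without building that deep boolean tree,
combined with POST and paging. A minimal SolrJ sketch follows; the URL, core
name, page size and the shortened uid list are assumptions for the example:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class PagedUidQuery {
    public static void main(String[] args) throws Exception {
        // assumed Solr URL and core name
        HttpSolrClient client = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/mycore").build();
        // {!terms} matches many values of one field without nested OR clauses
        SolrQuery query = new SolrQuery(
                "{!terms f=uid}TCYY1EGPR38SX7EZ,TCYY1EGPR6M1ARAZ,TCYY1EGPR3NTTO3Z");
        query.setRows(100); // page size
        for (int start = 0; ; start += 100) { // paginate the results
            query.setStart(start);
            // POST keeps the long parameter list out of the request URI
            QueryResponse rsp = client.query(query, SolrRequest.METHOD.POST);
            if (rsp.getResults().isEmpty()) {
                break;
            }
        }
        client.close();
    }
}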

Kind Regards,
Furkan KAMACI

On Thu, Aug 6, 2020 at 8:46 AM kumar gaurav  wrote:

> Hi,
>
> I am getting the following exception when passing a long string in the q
> parameter.
>
>
> q=uid:TCYY1EGPR38SX7EZ+OR+uid:TCYY1EGPR6M1ARAZ+OR+uid:TCYY1EGPR3NTTO3Z+OR+uid:TCYY1EGPR8L7XDZZ+OR+uid:TSHO3J0AGFUI9J3Z+OR+uid:TSHO3J0AI1CJJ2AZ+OR+uid:TSHO3J0AI4FZTBWZ+OR+uid:TDRE3J13G97WNCLZ+OR+uid:TCYY1EGPRA72BGHZ+OR+uid:TCYY1EGPR9EQUJYZ+OR+uid:TCYY1EGPRCTJXQPZ+OR+uid:TCYY1EGPR6RXPP0Z+OR+uid:TDRE3J13GBUSFV4Z+OR+uid:TTSH3FLDI7NJA8WZ+OR+uid:TERG3LIS70URWI5Z+OR+uid:TERG3LIS70QKOJAZ+OR+uid:TCYY1EGPR9EVMD5Z+OR+uid:TCYY1EGPRC8CRJ2Z+OR+uid:TCYY1EGPRGMD8MYZ+OR+uid:TCYY1EGPRM5OP68Z+OR+uid:TERG3LIS71AU8ZAZ+OR+uid:TERG3LIS719WRJWZ+OR+uid:THAQ3LIZCJ7TSEUZ+OR+uid:TERG3LIS70Q2O8IZ+OR+uid:TCYY1EGPRGXN2ZIZ+OR+uid:TCYY1EGPRGYTH3FZ+OR+uid:TCYY1EGPRK1JFUQZ+OR+uid:TCYY1EGPRM3JNN0Z+OR+uid:TERG3LIS70QPC4FZ+OR+uid:TBBA3LKKUOLVK89Z+OR+uid:TSOC1HULKNGBDUEZ+OR+uid:TSOC1HULKMTEOGTZ+OR+uid:TCYY1EGPRF93SE8Z+OR+uid:TCYY1EGPREUHNVMZ+OR+uid:TCYY1EGPRESMC0MZ+OR+uid:TCYY1EGPRDZE49OZ+OR+uid:THMB1OMS16B3OCPZ+OR+uid:TSOC1NS0MMMNAXOZ+OR+uid:TSOC1NS0GVJHP82Z+OR+uid:TSOC1NS0H3QAQQ7Z+OR+uid:TCYY2BESMSQWQBFZ+OR+uid:TCYY2BESMTJMA60Z+OR+uid:TCYY2BESN9EK5GFZ+OR+uid:TCYY2BESN9ER8PYZ+OR+uid:TSOC1NS0LBFBEAUZ+OR+uid:THAT2AOL6U500A1Z+OR+uid:THAT2AON5W2HVY9Z+OR+uid:THAT2AOL86LNHYTZ+OR+uid:TCYY2BESMO42C3GZ+OR+uid:TCYY1EGPSZSFLLTZ+OR+uid:TCYY1EGPT0X5B3DZ+OR+uid:THAT2AOL8GMD7O4Z+OR+uid:TSHT3FL6STFG1DEZ+OR+uid:TTSH3J0X6W92MPYZ+OR+uid:TTSH3J0X6SKNCECZ+OR+uid:TCYY1EGPS2J2UF4Z+OR+uid:TCYY1EGPT4HFILGZ+OR+uid:TCYY1EGPRQQQH7QZ+OR+uid:TCYY1EGPRZ72UA6Z+OR+uid:TSHT3FL6SWUTR9OZ+OR+uid:TTSH3J0X759RPQRZ+OR+uid:TTSH3J0X7ES5BR8Z+OR+uid:TTSH3J0X7CSXHAYZ+OR+uid:TCYY1EGPT74CXJMZ+OR+uid:TCYY1EGPS00631RZ+OR+uid:TCYY1EGPS0YU45YZ+OR+uid:TCYY1EGPS4BXXEFZ+OR+uid:TTSH3J0X7HFX0XMZ+OR+uid:TTSH3J0X1AY49RBZ+OR+uid:TTSH3J0X1B36WWWZ+OR+uid:TTSH3J0X1IOH3I8Z+OR+uid:TCYY1EGPSFA5BV2Z+OR+uid:TCYY1EGPSJ43BQNZ+OR+uid:TDAASAPEOHUVZZ+OR+uid:TCYY1EGPSPUZD2PZ+OR+uid:TTSH3J0X3B4S8E9Z+OR+uid:TTSH3J0X6O6TKRQZ+OR+uid:TBRF3LJHIFUI9G6Z+OR+uid:TTSH3J0X4O4S6AUZ+OR+uid:TCYY1EGPSPJHP2NZ+OR+uid:TCYY1EGPSQ95JCCZ+OR+uid:TCYY1EGPSSFR7Z0Z+OR+uid:TCYY1EGPSUYSCNKZ+OR+uid:TTSH3J0X65JG54CZ+OR+uid:TTSH3J0X6CS2ZAXZ+OR+uid:TTSH3J0X6HX537OZ+OR+uid:TTSH3J0X6PP1YGSZ+OR+uid:TCYY1EGPSWN05FGZ+OR+uid:TCYY1EGPSYB513WZ+OR+uid:TCYY1EGPSZR3X2SZ+OR+uid:TCYY1EGPT21MLB5Z+OR+uid:TBRF3LJHIFUOGPPZ+OR+uid:TTSH3J0X1TT376ZZ+OR+uid:TTSH3J0X4HE2ERLZ+OR+uid:TTSH3J0X39NEGZYZ+OR+uid:TCYY1EGPT4ZMPX4Z+OR+uid:TCCHSB60XT4YLZ+OR+uid:TCCHSB61WL7AZZ+OR+uid:TCYYSAUS1XIV3Z+OR+uid:TTSH3J0X6KMH7M2Z+OR+uid:TTSH3J0X1I5FYDGZ+OR+uid:TTSH3J0X4MISXH4Z+OR+uid:TCCHSB60XMUV1Z+OR+uid:TCCHSB61HK0B7Z+OR+uid:TCCHSB61VT84HZ+OR+uid:TCCHSB61ECHWDZ+OR+uid:TTSH3J0X1DU668XZ+OR+uid:TTSH3J0X1QGEU28Z+OR+uid:TTSH3J0X4BCEM0UZ+OR+uid:TTSH3J0X4MLHNIMZ+OR+uid:TCCHSB61E6Y87Z+OR+uid:TCYDSA2IT31VEZ+OR+uid:TCYDSA2IVH6HBZ+OR+uid:TDAASAPG0ADD5Z+OR+uid:TTSH3J0X4SZZWY7Z+OR+uid:TTSH3J0X36NM6Y7Z+OR+uid:TDAASAPFOY8EKZ+OR+uid:TDAASAPMOVIV5Z+OR+uid:TDAASAPI7JPNUZ+OR+uid:TDAASAPHV0UKXZ+OR+uid:TDAASAPFNE1HLZ+OR+uid:TDAASAPLVL68OZ+OR+uid:TDAASAPMLS2YXZ+OR+uid:TMOHS9OKD987QZ+OR+uid:TKKT1AL3XKSUWK4Z+OR+uid:TDAASAPEK2QUWZ+OR+uid:TDAASAPEL75NWZ+OR+uid:TDAASAPF8SZJSZ+OR+uid:TDAASAPBBDB7LZ+OR+uid:TKKT1AL3YGBBT6WZ+OR+uid:TKKT1AL3ZC37N63Z+OR+uid:TERG1F6W6ULALO6Z+OR+uid:TERG1F6W6V16EJOZ+OR+uid:TDAASAPF2MO5CZ+OR+uid:TCYY2BG5PF8FQEYZ+OR+uid:TCYY2BG5QA8QLLMZ+OR+uid:TCYY2BG5R1YBCSAZ+OR+uid:TERG1F6W6V1P63TZ+OR+uid:TERG1F6W6VDJOHPZ+OR+uid:TERG1F6W6VP70CYZ+OR+uid:TERG1F6W6WX5D8KZ+OR+uid:TCYY2BG5REQ3IJHZ+OR+uid:TCYY2BG5RRFVUGDZ+OR+uid:TDAASAPDZ31GKZ+OR+uid:TDAASAPH1HNF1Z+OR+uid:TERG1F6W6XQ5UWYZ+OR+uid:TERG1F6W71MQDF5Z+OR+uid:TERG1F6W736SVVJZ+OR+uid:TRNG1F6W9NO8IW7Z+OR+uid:TDAASAPCC56WXZ+OR+uid:TDA
ASAPE9IZZ0Z+OR+uid:TDAASAPHVBD96Z+OR+uid:TDAASAPIDRAJ6Z+OR+uid:TRNG1F6W9NR1AJNZ+OR+uid:TRNG1F6W9O4U8GRZ+OR+uid:TRNG1F6W9OJE1CJZ+OR+uid:TRNG1F6W9PQJQGUZ+OR+uid:TDPYS9W9CZH9OZ+OR+uid:TSBNSB3TFCCEJZ+OR+uid:TMOUSB8BSI8VZZ+OR+uid:THDPVN9E24NKWZ+OR+uid:TRNG1F6W9PW0F9IZ+OR+uid:TRNGTB9IGWYVVZ+OR+uid:TWAT2FNTMY5NML5Z+OR+uid:TWAT2FNTMY5S3JAZ+OR+uid:THDPVN9DMRA5PZ+OR+uid:TKEISA5KVQB3SZ+OR+uid:TLTB2HENA46KWL8Z+OR+uid:TLTBSALTEXOSKZ+OR+uid:TWAT2FNTONX41SZZ+OR+uid:TWAT2FNTOODV5F8Z+OR+uid:TWAT2FNTOOLD7WKZ
>
> {
>   "error":{
> "msg":"java.lang.StackOverflowError",
> "trace":"java.lang.RuntimeException:
> java.lang.StackOverflowError\n\tat
> org.apache.solr.servlet.HttpSolrCall.sendError(HttpSolrCall.java:662)\n\tat
> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:530)\n\tat
>
> org.apache.solr.servlet.SolrDispatchFilter.doF

Re: Solr Query

2020-07-07 Thread Furkan KAMACI
Hi Swetha,

The given URL is encoded, so you should decode it before analyzing it. The
plus character stands for whitespace when you encode a URL, and the minus
sign marks a negative (excluding) clause in a Solr query.
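
For example, a quick way to see the decoded query in Java (a minimal sketch;
the string is shortened from your URL):

import java.net.URLDecoder;

public class DecodeSolrQuery {
    public static void main(String[] args) throws Exception {
        // '+' decodes to a space, '%3a' to ':', '%5b' to '[' and so on
        String encoded = "q=(StartPublish%3a%5b*+TO+-12-31T23%3a59%3a59.999Z%5d)";
        String decoded = URLDecoder.decode(encoded, "UTF-8");
        System.out.println(decoded); // q=(StartPublish:[* TO -12-31T23:59:59.999Z])
    }
}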

Kind Regards,
Furkan KAMACI

On Tue, Jul 7, 2020 at 9:16 PM swetha vemula 
wrote:

> Hi,
>
> I have a URL and I want to break this down and run it in the admin console,
> but I am not sure what ++ and - represent in the query.
>
> select?q=(StartPublish%3a%5b*+TO+-12-31T23%3a59%3a59.999Z%5d++-Content%3a(Birthdays%5c%2fAnniversaries))++-FriendlyUrl%3a(*%2farchive%2f*))++((Title_NGram%3a(swetha))%5e500+OR+(MetaTitle_NGram%3a(swetha))%5e400+OR+(MetaKeywords_NGram%3a(swetha))%5e300+OR+(MetaDescription_NGram%3a(swetha))%5e200+OR+(Content_NGram%3a(swetha))%5e1))++(ACL%3a((Everyone)+OR+(MIDCO410%5c%5cMidco%5c-AllEmployees)+OR+(MIDCO410%5c%5cMidco%5c-DotNetDevelopers)+OR+(MIDCO410%5c%5cMidco%5c-WebAdmins)+OR+(MIDCO410%5c%5cMidco%5c-Source%5c-Admin)=0=1=xml=2.2
>
> Thank You,
> Swetha.
>


Re: Solr heap Old generation grows and it is not recovered by G1GC

2020-06-25 Thread Furkan KAMACI
Hi Reinaldo,

Which version of Solr do you use and could you share your cache settings?

On the other hand, did you check here:
https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems

Kind Regards,
Furkan KAMACI

On Thu, Jun 25, 2020 at 11:09 PM Odysci  wrote:

> Hi,
>
> I have a solrcloud setup with 12GB heap and I've been trying to optimize it
> to avoid OOM errors. My index has about 30million docs and about 80GB
> total, 2 shards, 2 replicas.
>
> In my testing setup I submit multiple queries to solr (same node),
> sequentially, and with no overlap between the documents returned in each
> query (so docs do not need to be kept in cache)
>
> When the queries return a smallish number of docs (say, below 1000), the
> heap behavior seems "normal". Monitoring the GC log I see that the young
> generation grows, then when GC kicks in it goes considerably down, and the
> old generation grows just a bit.
>
> However, at some point I have a query that returns over 300K docs (for a
> total size of approximately 1GB). At this very point the OLD generation
> size grows (almost by 2GB), and it remains high for all the remaining time.
> Even as new queries are executed, the OLD generation size does not go down,
> despite multiple GC calls done afterwards.
>
> Can anyone shed some light on this behavior?
>
> I'm using the following GC options:
> GC_TUNE=" \
>   -XX:+UseG1GC \
>   -XX:+PerfDisableSharedMem \
>   -XX:+ParallelRefProcEnabled \
>   -XX:G1HeapRegionSize=4m \
>   -XX:MaxGCPauseMillis=250 \
>   -XX:InitiatingHeapOccupancyPercent=75 \
>   -XX:+UseLargePages \
>   -XX:+AggressiveOpts \
>   "
> Thanks
> Reinaldo
>


Re: Unsubscribe me

2020-06-20 Thread Furkan KAMACI
Hi Shashikant,

You can tell me if you need help.

By the way, one can use solr-user-ow...@lucene.apache.org for such
questions, so as not to disturb the user mailing list.

Kind Regards,
Furkan KAMACI

On Sat, Jun 20, 2020 at 12:53 PM Erick Erickson 
wrote:

> Follow the instructions here:
> http://lucene.apache.org/solr/community.html#mailing-lists-irc. You must
> use the _exact_ same e-mail as you used to subscribe.
>
> If the initial try doesn't work and following the suggestions at the
> "problems" link doesn't work for you, let us know. But note you need to
> show us the _entire_ return header to allow anyone to diagnose the problem.
>
> Best,
> Erick
>
>
> > On Jun 20, 2020, at 3:22 AM, Shashikant Vaishnav <
> shashikantvaish...@gmail.com> wrote:
> >
> > Unsubscribe please
>
>


Re: Solr Terms browsing in descending order

2020-06-04 Thread Furkan KAMACI
Hi Jigar,

Is that a numeric field or not? By the way, have you checked the terms.sort
parameter or the JSON facet sort parameter?
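
The terms component sorts by index order only ascending, so descending order
needs the JSON Facet API in recent Solr versions. A minimal SolrJ sketch; the
URL, collection and field names are assumptions:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class DescendingTerms {
    public static void main(String[] args) throws Exception {
        HttpSolrClient client = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/mycollection").build(); // assumed
        SolrQuery query = new SolrQuery("*:*");
        query.setRows(0); // we only want the facet, not documents
        // sort:"index desc" walks the term dictionary in reverse order
        query.set("json.facet",
                "{terms_desc:{type:terms,field:myfield,limit:100,sort:\"index desc\"}}");
        System.out.println(client.query(query, SolrRequest.METHOD.POST));
        client.close();
    }
}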

Kind Regards,
Furkan KAMACI

On Mon, Jun 1, 2020 at 11:37 PM Jigar Gajjar
 wrote:

> Hello,
> Is it possible to retrieve index terms in descending order using the
> terms handler? Right now we get all terms in ascending order.
> Thanks, Jigar Gajjar
>


Re: Alternate Fields for Unified Highlighter

2020-05-22 Thread Furkan KAMACI
Hi David,

Thanks for the response! I use Unified Highlighter combined with
maxAnalyzedChars to accomplish my needs.

I'll file an issue and PR for it!

Kind Regards,
Furkan KAMACI

On Fri, May 22, 2020 at 11:25 PM David Smiley  wrote:

> Feel free to file an issue; I know it's not supported.  I also don't think
> it's a big deal because you can just ask Solr to return the
> "alternateField", thus letting the client side choose when to use that.  I
> suppose it might be large, so I can imagine a concern there.  It'd be nice
> if Solr had a DocTransformer to accomplish that.
>
> I know it's been awhile; I'm curious how the UH has been working for you,
> assuming you are using it.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Sun, Jun 2, 2019 at 6:47 AM Furkan KAMACI 
> wrote:
>
> > Hi All,
> >
> > I want to switch to the Unified Highlighter for performance reasons on my
> > Solr 7.6. I was using these fields:
> >
> > solrQuery.addHighlightField("content_*")
> > .set("f.content_en.hl.alternateField", "content")
> > .set("f.content_es.hl.alternateField", "content")
> > .set("hl.useFastVectorHighlighter", "true");
> > .set("hl.maxAlternateFieldLength", 300);
> >
> > As far as I can see, there are no definitions for alternate fields for the
> > unified highlighter. How can I set up such a configuration?
> >
> > Kind Regards,
> > Furkan KAMACI
> >
>


Re: Require java 8 upgrade

2020-05-21 Thread Furkan KAMACI
Hi Akhila,

Here is the related documentation:
https://lucene.apache.org/solr/5_3_1/SYSTEM_REQUIREMENTS.html which says:

"Apache Solr runs of Java 7 or greater, Java 8 is verified to be compatible
and may bring some performance improvements. When using Oracle Java 7 or
OpenJDK 7, be sure to not use the GA build 147 or update versions u40, u45
and u51! We recommend using u55 or later."

Kind Regards,
Furkan KAMACI

On Fri, May 22, 2020 at 4:26 AM Akhila John  wrote:

> Hi Team,
>
> We use solr 5.3.1 for sitecore 8.2.
> We require to upgrade Java version to 'Java 8 Update 251' and remove /
> Upgrade Wireshark to 3.2.3 in our application servers.
> Could you please advise if this would have any impact on Solr. Does
> Solr 5.3.1 support Java 8?
>
> Thanks and regards,
>
> Akhila
>
>


Re: TimestampUpdateProcessorFactory updates the field even if the value is present

2020-05-21 Thread Furkan KAMACI
Hi,

Do you have an id field for your documents? On the other hand, does your
document count increase when you index it again?

Kind Regards,
Furkan KAMACI

On Fri, May 22, 2020 at 1:03 AM gnandre  wrote:

> Hi,
>
> I do not pass that field at all.
>
> Here is the document that I index again and again to test through Solr
> Admin UI.
> {
> asset_id:"x:1",
> title:"x"
> }
>
> On Thu, May 21, 2020 at 5:25 PM Furkan KAMACI 
> wrote:
>
> > Hi,
> >
> > How do you index that document? Do you index it with an empty
> > *index_time_stamp_create* field the second time too?
> >
> > Kind Regards,
> > Furkan KAMACI
> >
> > On Fri, May 22, 2020 at 12:05 AM gnandre 
> wrote:
> >
> > > Hi,
> > >
> > > Following is the update request processor chain.
> > >
> > > <updateRequestProcessorChain default="true">
> > >   <processor class="solr.TimestampUpdateProcessorFactory">
> > >     <str name="fieldName">index_time_stamp_create</str>
> > >   </processor>
> > >   <processor class="solr.LogUpdateProcessorFactory" />
> > >   <processor class="solr.RunUpdateProcessorFactory" />
> > > </updateRequestProcessorChain>
> > >
> > > And, here is how the field is defined in schema.xml
> > >
> > > <field name="index_time_stamp_create" type="date" stored="true" />
> > >
> > > Every time I index the same document, above field changes its value
> with
> > > latest timestamp. According to TimestampUpdateProcessorFactory  javadoc
> > > page, if a document does not contain a value in the timestamp field, a
> > new
> > > Date will be generated and added as the value of that field. After the
> > > first indexing this document should always have a value, so why then it
> > > gets updated later?
> > >
> > > I am using Solr Admin UI's Documents tab to index the document for
> > testing.
> > > I am using Solr 6.3 in master-slave architecture mode.
> > >
> >
>


Re: TimestampUpdateProcessorFactory updates the field even if the value is present

2020-05-21 Thread Furkan KAMACI
Hi,

How do you index that document? Do you index it with an empty
*index_time_stamp_create* field the second time too?

Kind Regards,
Furkan KAMACI

On Fri, May 22, 2020 at 12:05 AM gnandre  wrote:

> Hi,
>
> Following is the update request processor chain.
>
> <updateRequestProcessorChain default="true">
>   <processor class="solr.TimestampUpdateProcessorFactory">
>     <str name="fieldName">index_time_stamp_create</str>
>   </processor>
>   <processor class="solr.LogUpdateProcessorFactory" />
>   <processor class="solr.RunUpdateProcessorFactory" />
> </updateRequestProcessorChain>
>
> And, here is how the field is defined in schema.xml
>
>  "true" />
>
> Every time I index the same document, above field changes its value with
> latest timestamp. According to TimestampUpdateProcessorFactory  javadoc
> page, if a document does not contain a value in the timestamp field, a new
> Date will be generated and added as the value of that field. After the
> first indexing this document should always have a value, so why then it
> gets updated later?
>
> I am using Solr Admin UI's Documents tab to index the document for testing.
> I am using Solr 6.3 in master-slave architecture mode.
>


Re: Solrcloud Garbage Collection Suspension linked across nodes?

2020-05-13 Thread Furkan KAMACI
Hi John,

I've denied and dropped him from the mailing list.

Kind Regards,
Furkan KAMACI

On Wed, May 13, 2020 at 8:06 PM John Blythe  wrote:

> can we get this person blocked?
> --
> John Blythe
>
>
> On Wed, May 13, 2020 at 1:05 PM ART GALLERY  wrote:
>
> > check out the videos on this website TROO.TUBE don't be such a
> > sheep/zombie/loser/NPC. Much love!
> > https://troo.tube/videos/watch/aaa64864-52ee-4201-922f-41300032f219
> >
> > On Mon, May 4, 2020 at 5:43 PM Webster Homer
> >  wrote:
> > >
> > > My company has several Solrcloud environments. In our most active cloud
> > we are seeing outages that are related to GC pauses. We have about 10
> > collections of which 4 get a lot of traffic. The solrcloud consists of 4
> > nodes with 6 processors and 11Gb heap size (25Gb physical memory).
> > >
> > > I notice that the 4 nodes seem to do their garbage collection at almost
> > the same time. That seems strange to me. I would expect them to be more
> > staggered.
> > >
> > > This morning we had a GC pause that caused problems. During that time
> > > our application service was reporting "No live SolrServers available to
> > > handle this request".
> > >
> > > Between 3:55 and 3:56 AM all 4 nodes were having some amount of garbage
> > > collection pauses; for 2 of the nodes it was minor, for one it was 50%.
> > > For 3 nodes it lasted until 3:57. However, the node with the worst impact
> > > didn't recover until 4 AM.
> > >
> > > How is it that all 4 nodes were in lock step doing GC? If they all are
> > doing GC at the same time it defeats the purpose of having redundant
> cloud
> > servers.
> > > We just switched this weekend from CMS to G1GC.
> > >
> > > At this point in time we also saw that traffic to solr was not well
> > distributed. The application calls solr using CloudSolrClient which I
> > thought did its own load balancing. We saw 10X more traffic going to one
> > solr node than all the others, then we saw it start hitting another node.
> > All solr queries come from our application.
> > >
> > > During this period of time I saw only 1 error message in the solr log:
> > > ERROR (zkConnectionManagerCallback-8-thread-1) [   ]
> > o.a.s.c.ZkController There was a problem finding the leader in
> > zk:org.apache.solr.common.SolrException: Could not get leader props
> > >
> > > We are currently using Solr 7.7.2
> > > GC Tuning
> > > GC_TUNE="-XX:NewRatio=3 \
> > > -XX:SurvivorRatio=4 \
> > > -XX:TargetSurvivorRatio=90 \
> > > -XX:MaxTenuringThreshold=8 \
> > > -XX:+UseG1GC \
> > > -XX:MaxGCPauseMillis=250 \
> > > -XX:+ParallelRefProcEnabled"
> > >
> >
>


Re: Portable Solr

2019-11-04 Thread Furkan KAMACI
Hi,

Are you looking for EmbeddedSolrServer [1]?:

By the way, using Spring Boot with an embedded server is production ready
[2].

On the other hand, embedded Solr with embedded Zookeeper etc. is less
flexible and should be reserved for special circumstances.
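
A minimal sketch of EmbeddedSolrServer usage, assuming a standard Solr home
directory containing a core named "mycore" (the path and core name are
assumptions):

import java.nio.file.Path;
import java.nio.file.Paths;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class EmbeddedExample {
    public static void main(String[] args) throws Exception {
        Path solrHome = Paths.get("/path/to/solr/home"); // assumed location
        EmbeddedSolrServer server = new EmbeddedSolrServer(solrHome, "mycore");
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "1");
        server.add(doc); // index without any HTTP server running
        server.commit();
        System.out.println(
                server.query(new SolrQuery("*:*")).getResults().getNumFound());
        server.close();
    }
}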

Kind Regards,
Furkan KAMACI

[1] https://cwiki.apache.org/confluence/display/solr/EmbeddedSolr
[2]
https://www.reddit.com/r/java/comments/499227/is_the_tomcat_server_embedded_in_spring_boot/

On Mon, Nov 4, 2019 at 9:59 AM Jörn Franke  wrote:

> Yes, simply search the mailing list or the web for embedded Solr and you
> will find what you need. Nevertheless, running embedded is just for
> development (also in case of Spring and others). Avoid it for an end user
> facing server application.
>
> > Am 03.11.2019 um 17:02 schrieb Java Developer :
> >
> > Hi,
> >
> > Like a portable embedded web server such as Spring Boot, Takes (
> > https://github.com/yegor256/takes), Undertow (http://undertow.io/), or
> > Rapidoid (https://www.rapidoid.org/):
> >
> > do we have a portable Solr server? I want to build a web application with
> > Solr with portability.
> >
> > The user should only need Java... the rest is portable...
> >
> > Please advise.
> >
> > Thanks
>


Re: 8.2.0 getting warning - unable to load jetty, not starting JettyAdminServer

2019-08-20 Thread Furkan KAMACI
Hi Arnold,

Such errors may arise due to file permission issues. I can run the latest
version of Solr via its Docker image without any errors. Could you write
down which steps you follow to run Solr in Docker?

Kind Regards,
Furkan KAMACI

On Tue, Aug 20, 2019 at 1:25 AM Arnold Bronley 
wrote:

> Hi,
>
> I am getting following warning in Solr admin UI logs. I did not get this
> warning in Solr 8.1.1
> Please note that I am using Solr docker slim image from here -
> https://hub.docker.com/_/solr/
>
> Unable to load jetty, not starting JettyAdminServer
>


Re: Slow Indexing scaling issue

2019-08-19 Thread Furkan KAMACI
Hi Parmeshwor,

2 hours for 3 GB of data seems too slow. We scale up to PBs this way:

1) Ignore all commits from the client
via IgnoreCommitOptimizeUpdateProcessorFactory (a configuration sketch
follows this list).
2) Heavy processing is done on an external Tika server instead of Solr Cell
with the embedded Tika feature.
3) Adjust autocommit, soft commit and shard size according to your needs.
4) Adjust JVM parameters.
5) Do not use swap if you can avoid it.
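
For item 1, the configuration sketch looks like this (it mirrors the example
in the Solr Reference Guide; make the chain the default one or reference it
from your update handler):

<updateRequestProcessorChain name="ignore-commit-from-client" default="true">
  <processor class="solr.IgnoreCommitOptimizeUpdateProcessorFactory">
    <!-- respond with HTTP 200 instead of an error when a commit is ignored -->
    <int name="statusCode">200</int>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.DistributedUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>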

Kind Regards,
Furkan KAMACI

On Tue, Aug 13, 2019 at 8:37 PM Erick Erickson 
wrote:

> Here’s some sample SolrJ code using Tika outside of Solr’s Extracting
> Request Handler, along with some info about why loading Solr with the job
> of extracting text is not optimal speed-wise:
>
> https://lucidworks.com/post/indexing-with-solrj/
>
> > On Aug 13, 2019, at 12:15 PM, Jan Høydahl  wrote:
> >
> > You may want to review
> https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems#SolrPerformanceProblems-SlowIndexing
> for some hints.
> >
> > Make sure to index with multiple parallel threads. Also remember that
> using /extract on the solr side is resource intensive and may make your
> cluster slow and unstable. Better to use Tika or similar on the client side
> and send text docs to solr.
> >
> > Jan Høydahl
> >
> >> 13. aug. 2019 kl. 16:52 skrev Parmeshwor Thapa <
> thapa.parmesh...@gmail.com>:
> >>
> >> Hi,
> >>
> >> We are having some issues scaling Solr indexing. Looking for
> >> suggestions.
> >>
> >> Setup : We have two solr cloud (7.4) instances running in separate cloud
> >> VMs with an external zookeeper ensemble.
> >>
> >> We are sending async / non-blocking HTTP requests to index documents in
> >> Solr.
> >>
> >> 2 cloud VMs (4 cores * 32 GB)
> >>
> >> 16 GB allocated for the JVM
> >>
> >> We are sending all types of documents to Solr, which it extracts and
> >> indexes, using the /update/extract request handler.
> >>
> >> We have stopwords.txt and dictionary (7mb) for stemming.
> >>
> >>
> >>
> >> Issue: indexing speed is quite slow for us. It is taking around 2 hours
> >> to index around 3 GB of data: 10,000 documents (PDF, XLS, Word, etc.).
> >> We are planning to index approximately 10 TB of data.
> >>
> >> Below is the solr config setting and schema,
> >>
> >>
> >>
> >> 
> >>
> >>   
> >>
> >> 
> >>
> >> 
> >>
> >> 
> >>
> >>  >> languageSet="auto" ruleType="APPROX" concat="true"/>
> >>
> >>   
> >>
> >> 
> >>
> >> 
> >>
> >>   
> >>
> >>  >> tokenizerModel="en-token.bin" sentenceModel="en-sent.bin"/>
> >>
> >>   
> >>
> >>  >> posTaggerModel="en-pos-maxent.bin"/>
> >>
> >>  >> dictionary="en-lemmatizer-again.dict.txt"/>
> >>
> >>
> >>
> >> 
> >>
> >> 
> >>
> >> 
> >>
> >> 
> >>
> >> 
> >>
> >>
> >>
> >>  >> stored="false"/>
> >>
> >> 
> >>
> >>
> >>
> >>  >> indexed="true" stored="true"/>
> >>
> >>  >> required="true" stored="true"/>
> >>
> >>  >> indexed="true" stored="true"/>
> >>
> >>  >> indexed="true" stored="true"/>
> >>
> >>  >> stored="true"/>
> >>
> >>  >> indexed="true" stored="true"/>
> >>
> >>  >> indexed="true" stored="true" />
> >>
> >>  >> stored="true"/>
> >>
> >>  >> indexed="true" stored="true"/>
> >>
> >>  >> indexed="true" stored="true"/>
> >>
> >>  >> indexed="true" stored="false"/>
> >>
> >>  >> indexed="true" stored="false"/>
> >>
> >>  >> indexed="true" stored="true"/>
> >>
> >>  >> indexed="true" stored="true"/>
> >>
> >>  >> indexed="true" stored="true"/>
> >>
> >>
> >>
> >> 
> >>
> >> 
> >>
> >>
> >>
> >>  stored="false"
> >> docValues="false" />
> >>
> >>
> >>
> >> And below is the solrConfig,
> >>
> >>
> >>
> >> 
> >>
> >>  BEST_COMPRESSION
> >>
> >> 
> >>
> >>
> >>
> >>   
> >>
> >>   1000
> >>
> >>   60
> >>
> >>   false
> >>
> >>   
> >>
> >>
> >>
> >>   
> >>
> >> ${solr.autoSoftCommit.maxTime:-1}
> >>
> >>   
> >>
> >>
> >>
> >>  >>
> >> startup="lazy"
> >>
> >> class="solr.extraction.ExtractingRequestHandler" >
> >>
> >>   
> >>
> >> true
> >>
> >> ignored_
> >>
> >> content
> >>
> >>   
> >>
> >> 
> >>
> >> *Thanks,*
> >>
> >> *Parmeshwor Thapa*
>
>


Re: Solr 7.6.0: PingRequestHandler - Changing the default query (*:*)

2019-08-04 Thread Furkan KAMACI
Hi,

You can change the invariants, i.e. *qt* and *q*, of a *PingRequestHandler*:

<requestHandler name="/admin/ping" class="solr.PingRequestHandler">
  <lst name="invariants">
    <str name="qt">/search</str>
    <str name="q">some test query</str>
  </lst>
</requestHandler>

Check the documentation for more info:
https://lucene.apache.org/solr/7_6_0//solr-core/org/apache/solr/handler/PingRequestHandler.html

Kind Regards,
Furkan KAMACI

On Sat, Aug 3, 2019 at 4:17 PM Erick Erickson 
wrote:

> You can also (I think) explicitly define the ping request handler in
> solrconfig.xml to do something else.
>
> > On Aug 2, 2019, at 9:50 AM, Jörn Franke  wrote:
> >
> > Not sure if this is possible, but why not create a query handler in Solr
> with any custom query and you use that as ping replacement ?
> >
> >> Am 02.08.2019 um 15:48 schrieb dinesh naik :
> >>
> >> Hi all,
> >> I have a few clusters with a huge data set, and whenever a node goes down
> >> it is not able to recover due to the below reasons:
> >>
> >> 1. The ping request handler is taking more than 10-15 seconds to respond.
> >> The ping request handler, however, expects it will return in less than 1
> >> second and fails a recovery request if it is not responded to in this
> >> time. Therefore recoveries would never start.
> >>
> >> 2. The soft commit interval is very low, i.e. 5 sec. This is a business
> >> requirement, so not much can be done here.
> >>
> >> As the standard/default admin/ping request handler is using *:* queries,
> >> the response time is much higher, and I am looking for an option to
> >> change it so that the ping handler returns the results within a few
> >> milliseconds.
> >>
> >> here is an example for standard query time:
> >>
> >> snip---
> >> curl "
> >>
> http://hostname:8983/solr/parts/select?indent=on&q=*:*&rows=0&wt=json&distrib=false&debug=timing
> >> "
> >> {
> >> "responseHeader":{
> >>   "zkConnected":true,
> >>   "status":0,
> >>   "QTime":16620,
> >>   "params":{
> >> "q":"*:*",
> >> "distrib":"false",
> >> "debug":"timing",
> >> "indent":"on",
> >> "rows":"0",
> >> "wt":"json"}},
> >> "response":{"numFound":1329638799,"start":0,"docs":[]
> >> },
> >> "debug":{
> >>   "timing":{
> >> "time":16620.0,
> >> "prepare":{
> >>   "time":0.0,
> >>   "query":{
> >> "time":0.0},
> >>   "facet":{
> >> "time":0.0},
> >>   "facet_module":{
> >> "time":0.0},
> >>   "mlt":{
> >> "time":0.0},
> >>   "highlight":{
> >> "time":0.0},
> >>   "stats":{
> >> "time":0.0},
> >>   "expand":{
> >> "time":0.0},
> >>   "terms":{
> >> "time":0.0},
> >>   "block-expensive-queries":{
> >> "time":0.0},
> >>   "slow-query-logger":{
> >> "time":0.0},
> >>   "debug":{
> >> "time":0.0}},
> >> "process":{
> >>   "time":16619.0,
> >>   "query":{
> >> "time":16619.0},
> >>   "facet":{
> >> "time":0.0},
> >>   "facet_module":{
> >> "time":0.0},
> >>   "mlt":{
> >> "time":0.0},
> >>   "highlight":{
> >> "time":0.0},
> >>   "stats":{
> >> "time":0.0},
> >>   "expand":{
> >> "time":0.0},
> >>   "terms":{
> >> "time":0.0},
> >>   "block-expensive-queries":{
> >> "time":0.0},
> >>   "slow-query-logger":{
> >> "time":0.0},
> >>   "debug":{
> >> "time":0.0}
> >>
> >>
> >> snap
> >>
> >> Can we use the query _root_:abc in the ping request handler? I tried
> >> this query and it returns the results within a few milliseconds, and the
> >> nodes are also able to recover without any issue.
> >>
> >> we want to use the _root_ field for querying as this field is available
> >> in all our clusters with the below definition:
> >> <field name="_root_" type="string" indexed="true" termOffsets="false"
> >> stored="false" termPayloads="false" termPositions="false"
> >> docValues="false" termVectors="false"/>
> >> Could you please let me know if using _root_ for querying in
> >> pingRequestHandler will cause any problem?
> >>
> >> <lst name="invariants"> <str name="qt">/select</str>
> >> <str name="q">_root_:abc</str> </lst>
> >>
> >>
> >> --
> >> Best Regards,
> >> Dinesh Naik
>
>


Re: NRT for new items in index

2019-08-03 Thread Furkan KAMACI
Hi,

First of all, could you check here:
https://lucidworks.com/post/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
to better understand hard commits, soft commits and transaction logs, which
are the levers for achieving NRT search.
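
As a rough illustration, an NRT-leaning solrconfig.xml usually combines
infrequent hard commits that do not open a searcher with frequent soft
commits that make new documents visible; the intervals below are assumptions
to be tuned:

<autoCommit>
  <!-- hard commit: flush the transaction log to stable storage every 60s -->
  <maxTime>60000</maxTime>
  <!-- do not open a new searcher; visibility is handled by soft commits -->
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <!-- soft commit: make newly added documents searchable every 5s -->
  <maxTime>5000</maxTime>
</autoSoftCommit>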

Kind Regards,
Furkan KAMACI

On Wed, Jul 31, 2019 at 3:47 PM profiuser  wrote:

> Hi,
>
> We have about 400,000,000 items in a Solr collection.
> We have set the autocommit property for this collection to 15 minutes.
> It is a big collection and we are using some caches etc., therefore we have
> a big autocommit value.
>
> This has the disadvantage that we don't get NRT searches.
>
> We would like to have NRT at least for searching the newly added items.
>
> We read about the new functionality "category routed aliases" in Solr
> version 8.1.
>
> And we got an idea: we could add a routing field to our collection schema.
> At indexing time we check if the item is new and set the routing field to
> "new"; if the item is older than some time period, we set the value to
> "old". We would then have one category routed alias, routedCollection,
> backed by 2 collections, old and new.
>
> If we index a new item, the router chooses the new collection and the item
> is inserted into it. After some period we reindex the item, decide that it
> is old, and set the routing field to "old". The router decides to update
> (insert) the item into the old collection. We expected that Solr would
> automatically check uniqueness across all routed collections, and that if
> Solr found the item in the other collection it would be automatically
> deleted. But it is not!
>
> Is this expected behaviour?
>
> Could this functionality be used for the issue we have? Or could someone
> suggest another solution which ensures that we have all new items ready
> for NRT searches?
>
> Thanks for your help
>
>
>
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


Re: SolrCloud recommended I/O RAID level

2019-07-30 Thread Furkan KAMACI
Hi Adi,

RAID10 is good for satisfying both indexing and query, striping across
mirror sets. However, you lose half of your raw disk space, just like with
RAID1.

Here is a mail thread of mine which discusses RAID levels specifically for
Solr:
https://lists.apache.org/thread.html/462d7467b2f2d064223eb46763a6a6e606ac670fe7f7b40858d97c0d@1366325333@%3Csolr-user.lucene.apache.org%3E

Kind Regards,
Furkan KAMACI

On Mon, Jul 29, 2019 at 10:25 PM Kaminski, Adi 
wrote:

> Hi,
> We are about to size large environment with 7 nodes/servers with
> replication factor 2 of SolrCloud cluster (using Solr 7.6).
>
> The system contains parent-child (nested documents) schema, and about to
> have 40M parent docs with 50-80 child docs each (in total 2-3.2B Solr docs).
>
> We have a use case that will require to update parent document fields
> triggered by an application flow (with re-indexing or atomic/partial update
> approach, that will probably require to upgrade to Solr 8.1.1 that supports
> this feature and contains some fixes in nested docs handling area).
>
> Since these updates might be quite heavy from IOPS perspective, we would
> like to make sure that the IO hardware and RAID configuration are optimized
> (r/w ratio of 50% read and 50% write, to allow balanced search and update
> flows).
>
> Can someone share similar scale/use- case/deployment RAID level
> configuration ?
> (I assume that RAID5&6 are not an option due to parity/dual parity heavy
> impact on write operations, so it leaves RAID 0, 1 or 10).
>
> Thanks in advance,
> Adi
>
>


Re: Basic Query Not Working - Please Help

2019-07-30 Thread Furkan KAMACI
Hi Vipul,

You are welcome!

Kind Regards,
Furkan KAMACI

On Fri, Jul 26, 2019 at 11:07 AM Vipul Bahuguna <
newthings4learn...@gmail.com> wrote:

> Hi Furkan -
>
> I realized that I was searching incorrectly.
> I later realized that if I need to search by a specific field, I need to do
> as you suggested -
> q=appname:App1.
>
> Or, if I need to simply search by App1, then I need to use  to
> index my field appname at the time of insertion so that it can later be
> searched without specifying the field name.
>
> thanks for your response.
>
> On Tue, Jul 23, 2019 at 6:07 AM Furkan KAMACI 
> wrote:
>
> > Hi Vipul,
> >
> > Which query do you submit? Is it this one:
> >
> > q=appname:App1
> >
> > Kind Regards,
> > Furkan KAMACI
> >
> > On Mon, Jul 22, 2019 at 10:52 AM Vipul Bahuguna <
> > newthings4learn...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I have installed SOLR 8.1.1.
> > > I am new and trying the very basics.
> > >
> > > I installed solr8.1.1 on Windows and I am using SOLR in standalone
> mode.
> > >
> > > Steps I followed -
> > >
> > > 1. created a core as follows:
> > > solr create_core -c dox
> > >
> > > 2. updated the managed_schema.xml file to add a few fields specific
> > > to my schema as below:
> > >
> > >  stored="true"/>
> > >  stored="true"/>
> > >  > stored="true"/>
> > >  > > stored="true"/>
> > >
> > > 3. then i restarted SOLR
> > >
> > > 4. then i went to the Documents tab to enter my sample data for
> indexing,
> > > which looks like below:
> > > {
> > >
> > >   "id" : "1",
> > >   "prjname" : "Project1",
> > >   "apps" : [
> > > {
> > >   "appname" : "App1",
> > >   "topics" : [
> > > {
> > >   "topicname" : "topic1",
> > >   "links" : [
> > > "http://www.google.com;,
> > > "http://www.t6.com;
> > >   ]
> > > },
> > > {
> > >   "topicname" : "topic2",
> > >   "links" : [
> > > "http://www.java.com;,
> > > "http://www.rediff.com;
> > >   ]
> > > }
> > >   ]
> > > },
> > > {
> > >   "appname" : "App2",
> > >   "topics" : [
> > > {
> > >   "topicname" : "topic3",
> > >   "links" : [
> > > "http://www.t3.com;,
> > > "http://www.t4.com;
> > >   ]
> > > },
> > > {
> > >   "topicname" : "topic4",
> > >   "links" : [
> > > "http://www.rules.com;,
> > > "http://www.amazon.com;
> > >   ]
> > > }
> > >   ]
> > > }
> > >   ]
> > > }
> > >
> > > 5. Now when I go to the Query tab and click Execute Search with *:*, it
> > > shows
> > > my recently added document as follows:
> > > {
> > > "responseHeader":{ "status":0, "QTime":0, "params":{ "q":"*:*", "_":
> > > "1563780352100"}}, "response":{"numFound":1,"start":0,"docs":[ {
> > "id":"1",
> > > "
> > > prjname":["Project1"], "apps":["{appname=App1,
> topics=[{topicname=topic1,
> > > links=[http://www.google.com, http://www.t6.com]}, {topicname=topic2,
> > > links=[http://www.java.com, http://www.rediff.com]}]};,
> "{appname=App2,
> > > topics=[{topicname=topic3, links=[http://www.t3.com, http://www.t4.com
> > ]},
> > > {topicname=topic4, links=[http://www.rules.com, http://www.amazon.com
> > > ]}]}"],
> > > "_version_":1639742305772503040}] }}
> > >
> > > 6. But now when I am trying to search based on field topicname or
> > prjname,
> > > it does not returns any document. Even if put anything in q like App1,
> > zero
> > > results are being returned.
> > >
> > >
> > > Can someone help me understand what I might have done incorrectly?
> > > Maybe I defined my schema incorrectly.
> > >
> > > Thanks in advance
> > >
> >
>


Re: Problem with solr suggester in case of non-ASCII characters

2019-07-30 Thread Furkan KAMACI
Hi Roland,

Could you check the Analysis tab (
https://lucene.apache.org/solr/guide/8_1/analysis-screen.html) and tell us
how the term is analyzed at both query and index time?

Kind Regards,
Furkan KAMACI

On Tue, Jul 30, 2019 at 4:50 PM Szűcs Roland 
wrote:

> Hi All,
>
> I have an author suggester (searchcomponent and the related request
> handler) defined in solrconfig:
> <searchComponent name="suggest" class="solr.SuggestComponent">
>   <lst name="suggester">
>     <str name="name">author</str>
>     <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
>     <str name="dictionaryImpl">DocumentDictionaryFactory</str>
>     <str name="field">BOOK_productAuthor</str>
>     <str name="suggestAnalyzerFieldType">short_text_hu</str>
>     <str name="indexPath">suggester_infix_author</str>
>     <str name="buildOnStartup">false</str>
>     <str name="buildOnCommit">false</str>
>     <int name="minPrefixChars">2</int>
>   </lst>
> </searchComponent>
>
> <requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
>   <lst name="defaults">
>     <str name="suggest">true</str>
>     <str name="suggest.count">10</str>
>     <str name="suggest.dictionary">author</str>
>   </lst>
>   <arr name="components">
>     <str>suggest</str>
>   </arr>
> </requestHandler>
> 
>
> The author field has only minimal text processing at query and index time,
> based on the following definition:
>  positionIncrementGap="100" multiValued="true">
> 
>   
>   
>ignoreCase="true"/>
>   
> 
> 
>   
>ignoreCase="true"/>
>   
> 
>   
>docValues="true"/>
>docValues="true" multiValued="true"/>
>positionIncrementGap="100">
> 
>   
>   
>ignoreCase="true"/>
>   
>   
> 
>   
>
> When I use queries with only ASCII characters, the results are correct:
> "Al":{
> "term":"Alexandre Dumas", "weight":0, "payload":""}
>
> When I try it with a Hungarian author name with a special character:
> "Jó":"author":{
> "Jó":{ "numFound":0, "suggestions":[]}}
>
> When I try it with three letters, it works again:
> "Józ":"author":{
> "Józ":{ "numFound":10, "suggestions":[{ "term":"Bajza József", "
> weight":0, "payload":""}, { "term":"Eötvös József", "weight":0, "
> payload":""}, { "term":"Eötvös József", "weight":0, "payload":""}, {
> "term":"Eötvös József", "weight":0, "payload":""}, {
> "term":"József
> Attila", "weight":0, "payload":""}..
>
> Any idea how it can happen that a longer string has more matches than a
> shorter one? It is inconsistent. What can I do to fix it, as it would
> result in a poor customer experience?
> They would feel that sometimes they need 2 and sometimes 3 characters to
> get suggestions.
>
> Thanks in advance,
> Roland
>


Re: More Highlighting details

2019-07-25 Thread Furkan KAMACI
Hi Govind,

Highlighting is the easiest way to detect it. You can find a similar
question here:
https://stackoverflow.com/questions/9629147/how-to-return-column-that-matched-the-query-in-solr
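
For example, with SolrJ (a minimal sketch; the URL, collection name and query
text are assumptions), the keys of the per-document highlighting map are
exactly the fields the query matched in:

import java.util.List;
import java.util.Map;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class MatchedFields {
    public static void main(String[] args) throws Exception {
        HttpSolrClient client = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/mycollection").build(); // assumed
        SolrQuery query = new SolrQuery("some text");
        query.setHighlight(true);
        query.addHighlightField("*"); // highlight whichever field matched
        QueryResponse rsp = client.query(query);
        // doc id -> (field name -> snippets); inner keys are the matched fields
        Map<String, Map<String, List<String>>> hl = rsp.getHighlighting();
        hl.forEach((id, fields) ->
                System.out.println(id + " matched in " + fields.keySet()));
        client.close();
    }
}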

Kind Regards,
Furkan KAMACI

On Wed, Jul 24, 2019 at 9:28 PM govind nitk  wrote:

> Hi Furkan KAMACI,
>
> Thanks for your thoughts on maxAnalyzedChars.
>
> So, how can we tell whether it matched or not? Is there any way to get such
> data from an extra payload in the response from Solr?
>
> Thanks and regards
> Govind
>
> On Wed, Jul 24, 2019 at 8:43 PM Furkan KAMACI 
> wrote:
>
> > Hi Govind,
> >
> > Using *hl.tag.pre* and *hl.tag.post* may help you. However, you should
> > keep in mind that even if such a term exists in the desired field, the
> > highlighter can use the fallback field due to the *hl.maxAnalyzedChars*
> > parameter.
> >
> > Kind Regards,
> > Furkan KAMACI
> >
> > On Wed, Jul 24, 2019 at 8:24 AM govind nitk 
> wrote:
> >
> > > Hi all,
> > > How about using hl.tag.pre and hl.tag.post? If these are present then
> > > there is actually a field match, otherwise it is the default summary?
> > > Will it work, or are there some cases where it will not?
> > >
> > >
> > > Thanks in advance.
> > >
> > >
> > >
> > > On Tue, Jul 23, 2019 at 5:48 PM govind nitk 
> > wrote:
> > >
> > > > Hi all,
> > > >
> > > > How to get more details for highlighting ?
> > > >
> > > > I am using
> > > >
> > >
> >
> hl.method=unified&hl.fl=title,url,paragraph&hl=true&hl.defaultSummary=true
> > > >
> > > > So, if query words are not matched, I am getting the defaultSummary,
> > > > which is great. *Can we get more info saying whether it found matches
> > > > or a default summary? How do we get such information?*
> > > > Also, is it a good idea to use highlighting on URLs? Will URLs get
> > > > distorted by any chance?
> > > >
> > > >
> > > > Best Regards,
> > > > Govind
> > > >
> > > >
> > >
> >
>


Re: SOLR Atomic Update - String multiValued Field

2019-07-24 Thread Furkan KAMACI
Hi Doss,

What was the existing value, and what happens after you do the atomic update?

Kind Regards,
Furkan KAMACI

On Wed, Jul 24, 2019 at 2:47 PM Doss  wrote:

> Hi,
>
> I have a multiValued field of type String.
>
> <field name="namelist" type="string" multiValued="true"/>
>
> I want to keep this list unique, so I am using atomic updates with
> "add-distinct"
>
> {"docid":123456,"namelist":{"add-distinct":["Adam","Jane"]}}
>
> but this is not maintaining the expected uniqueness, am I doing something
> wrong? Guide me please.
>
> Thanks,
> Doss.
>


Re: More Highlighting details

2019-07-24 Thread Furkan KAMACI
Hi Govind,

Using *hl.tag.pre* and *hl.tag.post* may help you. However, you should keep
in mind that even if such a term exists in the desired field, the highlighter
can use the fallback field due to the *hl.maxAnalyzedChars* parameter.
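
For instance (a sketch; the query text and the markers are arbitrary):

import org.apache.solr.client.solrj.SolrQuery;

public class TaggedHighlights {
    static SolrQuery build() {
        SolrQuery query = new SolrQuery("some text");
        query.setHighlight(true);
        query.set("hl.method", "unified");
        query.set("hl.tag.pre", "<em>");   // emitted only around real matches
        query.set("hl.tag.post", "</em>");
        // Snippets containing <em>...</em> come from actual field matches;
        // snippets without the markers are fallback/default summaries.
        return query;
    }
}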

Kind Regards,
Furkan KAMACI

On Wed, Jul 24, 2019 at 8:24 AM govind nitk  wrote:

> Hi all,
> How about using hl.tag.pre and hl.tag.post? If these are present then there
> is actually a field match, otherwise it is the default summary?
> Will it work, or are there some cases where it will not?
>
>
> Thanks in advance.
>
>
>
> On Tue, Jul 23, 2019 at 5:48 PM govind nitk  wrote:
>
> > Hi all,
> >
> > How to get more details for highlighting ?
> >
> > I am using
> >
> hl.method=unified&hl.fl=title,url,paragraph&hl=true&hl.defaultSummary=true
> >
> > So, if query words are not matched, I am getting the defaultSummary,
> > which is great. *Can we get more info saying whether it found matches or
> > a default summary? How do we get such information?*
> > Also, is it a good idea to use highlighting on URLs? Will URLs get
> > distorted by any chance?
> >
> >
> > Best Regards,
> > Govind
> >
> >
>


Re: Basic Query Not Working - Please Help

2019-07-22 Thread Furkan KAMACI
Hi Vipul,

Which query do you submit? Is it this one:

q=appname:App1

Kind Regards,
Furkan KAMACI

On Mon, Jul 22, 2019 at 10:52 AM Vipul Bahuguna <
newthings4learn...@gmail.com> wrote:

> Hi,
>
> I have installed SOLR 8.1.1.
> I am new and trying the very basics.
>
> I installed solr8.1.1 on Windows and I am using SOLR in standalone mode.
>
> Steps I followed -
>
> 1. created a core as follows:
> solr create_core -c dox
>
> 2. updated the managed_schema.xml file to add a few fields specific
> to my schema as below:
>
> 
> 
> 
>  stored="true"/>
>
> 3. then i restarted SOLR
>
> 4. then i went to the Documents tab to enter my sample data for indexing,
> which looks like below:
> {
>
>   "id" : "1",
>   "prjname" : "Project1",
>   "apps" : [
> {
>   "appname" : "App1",
>   "topics" : [
> {
>   "topicname" : "topic1",
>   "links" : [
> "http://www.google.com;,
> "http://www.t6.com;
>   ]
> },
> {
>   "topicname" : "topic2",
>   "links" : [
> "http://www.java.com;,
> "http://www.rediff.com;
>   ]
> }
>   ]
> },
> {
>   "appname" : "App2",
>   "topics" : [
> {
>   "topicname" : "topic3",
>   "links" : [
> "http://www.t3.com;,
> "http://www.t4.com;
>   ]
> },
> {
>   "topicname" : "topic4",
>   "links" : [
> "http://www.rules.com;,
> "http://www.amazon.com;
>   ]
> }
>   ]
> }
>   ]
> }
>
> 5. Now when I go to the Query tab and click Execute Search with *:*, it shows
> my recently added document as follows:
> {
> "responseHeader":{ "status":0, "QTime":0, "params":{ "q":"*:*", "_":
> "1563780352100"}}, "response":{"numFound":1,"start":0,"docs":[ { "id":"1",
> "
> prjname":["Project1"], "apps":["{appname=App1, topics=[{topicname=topic1,
> links=[http://www.google.com, http://www.t6.com]}, {topicname=topic2,
> links=[http://www.java.com, http://www.rediff.com]}]}", "{appname=App2,
> topics=[{topicname=topic3, links=[http://www.t3.com, http://www.t4.com]},
> {topicname=topic4, links=[http://www.rules.com, http://www.amazon.com
> ]}]}"],
> "_version_":1639742305772503040}] }}
>
> 6. But now when I am trying to search based on the field topicname or
> prjname, it does not return any document. Even if I put anything in q like
> App1, zero results are being returned.
>
>
> Can someone help me understand what I might have done incorrectly?
> Maybe I defined my schema incorrectly.
>
> Thanks in advance
>


Alternate Fields for Unified Highlighter

2019-06-02 Thread Furkan KAMACI
Hi All,

I want to switch to the Unified Highlighter for performance reasons on my
Solr 7.6. I was using these fields:

solrQuery.addHighlightField("content_*")
.set("f.content_en.hl.alternateField", "content")
.set("f.content_es.hl.alternateField", "content")
.set("hl.useFastVectorHighlighter", "true");
.set("hl.maxAlternateFieldLength", 300);

As far as I can see, there are no definitions for alternate fields for the
unified highlighter. How can I set up such a configuration?

Kind Regards,
Furkan KAMACI


Solr URI Too Long

2019-05-05 Thread Furkan KAMACI
Hi,

I got a URI Too Long error and am trying to fix it. I'm aware of this
conversation:
http://lucene.472066.n3.nabble.com/URI-is-too-long-td4254270.html

I've tried:

Used POST instead of GET in SolrJ
Set 2147483647 in solrconfig.xml for each core.
Defined SOLR_OPTS="$SOLR_OPTS
-Dorg.eclipse.jetty.server.Request.maxFormContentSize=200" in solr.in.sh

I need to send a long query into Solr. I use Solr 7.6.0 and plan to use 8.1
whenever available.

Any ideas about how to overcome this?

Kind Regards,
Furkan KAMACI


JSON Facet Count All Information

2019-05-02 Thread Furkan KAMACI
Hi,

I have a multivalued field in which I store some metadata. I want to see the
top 4 metadata values across my documents and also the total metadata count.
I run this query:

q=metadata:[*+TO+*]&rows=0&json.facet={top_tags:{type:terms,field:metadata,limit:4,mincount:1}}

However, how can I calculate the total term count in a multivalued field
besides running a JSON facet on it?

Kind Regards,
Furkan KAMACI


Fetching All Terms and Corresponding Documents

2019-03-18 Thread Furkan KAMACI
Hi,

I need to iterate over all terms in a Solr index, and then find related
documents for the terms that match my criteria.

I know that I can send a query to *LukeRequestHandler*:

*/admin/luke?fl=content&numTerms={distinct term count}&wt=json*

and then check my criteria. If a term matches, I can send an *fq* to
retrieve the related docs.

However, is there any other efficient way (via REST or SolrJ) for my case?

Kind Regards,
Furkan KAMACI


Re: No registered leader was found after waiting for 1000ms in solr

2019-03-05 Thread Furkan KAMACI
Hi Maimuna,

Could you check here:
https://stackoverflow.com/questions/47868737/solr-cloud-no-registered-leader-was-found-after-waiting-for-4000ms

Kind Regards,
Furkan KAMACI

On Wed, Mar 6, 2019 at 10:25 AM maimuna ambareen 
wrote:

> When I run the healthcheck command in Solr:
> bin/solr healthcheck -c mypet -z x.x.x.x:2181, I am getting
> "No registered leader was found after waiting for 1000ms".
>
> However, I am able to find other details and the list of live nodes in
> the output. Can someone explain the reason behind this error?
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


Re: SolrCloud one server with high load

2019-03-04 Thread Furkan KAMACI
Hi Gaël,

Do all three servers have the same specifications? On the other hand, is
your Varnish load balancing configuration round-robin?

Kind Regards,
Furkan KAMACI

On Mon, Mar 4, 2019 at 3:18 PM Gael Jourdan-Weil <
gael.jourdan-w...@kelkoogroup.com> wrote:

> Hello,
>
> I come again to the community for some ideas regarding a performance issue
> we are having.
>
> We have a SolrCloud cluster of 3 servers.
> Each server hosts 1 replica of 2 collections.
> There is no sharding, every server hosts the whole collection.
>
> Requests are evenly distributed by a Varnish system.
>
> During some peaks of requests, we see one server of the cluster having
> very high load while the two others are totally fine.
> The server experiencing this high load is always the same until we reboot
> it and the behavior moves to another server.
> The server experiencing the issue is not necessarily the leader.
> All servers receive the same number of requests per seconds.
>
> Load data:
> - Server1: 5% CPU when low QPS, 90% CPU when high QPS (this one having
> issues)
> - Server2: 5% CPU when low QPS, 25% CPU when high QPS
> - Server3: 5% CPU when low QPS, 20% CPU when high QPS
>
> What could explain this behavior in SolrCloud mechanisms?
>
> Thank you for reading,
>
> Gaël Jourdan-Weil
>


Re: Full import alternatives

2019-03-04 Thread Furkan KAMACI
Hi Sami,

Did you check the delta-import documentation:
https://wiki.apache.org/solr/DataImportHandler#Using_delta-import_command

Kind Regards,
Furkan KAMACI

On Thu, Feb 28, 2019 at 7:24 PM sami  wrote:

> Hi Shawan, can you please suggest a small program, or at least the backbone
> of a program, which can give me hints on how exactly to achieve what you
> describe? I quote: "I send a full-import DIH command to all of the
> shards, and each one makes an SQL query to MySQL, all of them running in
> parallel."
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


Re: Code review for SOLR related changes.

2019-03-04 Thread Furkan KAMACI
Hi Fiz,

Could you elaborate on your question?

Kind Regards,
Furkan KAMACI

On Fri, Mar 1, 2019 at 7:41 PM Fiz Ahmed  wrote:

> Hi Solr Experts,
>
> Can you please suggest code review techniques for Solr-related changes in a
> project?
>
>
> Thanks
> FIZ
> AML Team.
>


Re: Spring Boot Solr+ Kerberos+ Ambari

2019-02-21 Thread Furkan KAMACI
Hi,

You can also check here:
https://community.hortonworks.com/articles/15159/securing-solr-collections-with-ranger-kerberos.html
On the other hand, we have a section for Solr Kerberos in the documentation:
https://lucene.apache.org/solr/guide/6_6/kerberos-authentication-plugin.html
For any Ambari-specific questions, you can ask them at this forum:
https://community.hortonworks.com/topics/forum.html

Kind Regards,
Furkan KAMACI

On Thu, Feb 21, 2019 at 1:43 PM Rushikesh Garadade <
rushikeshgarad...@gmail.com> wrote:

> Hi Furkan,
> I think the link you provided is for the Ranger audit setting; please
> correct me if I am wrong.
>
> I use HDP 2.6.5, which has Solr 5.6.
>
> Thank you,
> Rushikesh Garadade
>
>
> On Thu, Feb 21, 2019, 2:57 PM Furkan KAMACI 
> wrote:
>
> > Hi Rushikesh,
> >
> > Did you check here:
> >
> >
> https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_security/content/solr_ranger_configure_solrcloud_kerberos.html
> >
> > By the way, which versions do you use?
> >
> > Kind Regards,
> > Furkan KAMACI
> >
> > On Thu, Feb 21, 2019 at 11:41 AM Rushikesh Garadade <
> > rushikeshgarad...@gmail.com> wrote:
> >
> > > Hi All,
> > >
> > > I am trying to set Kerberos for Solr which is installed on Hortonworks
> > > Ambari.
> > >
> > > Q1. Is Ranger a mandatory component for Solr Kerberos configuration on
> > > Ambari?
> > >
> > > I am getting a little confused with the documents available on the
> > > internet for this.
> > > I tried to do without ranger but not getting any success.
> > >
> > > If is there any good document for the same, please let me know.
> > >
> > > Thanks,
> > > Rushikesh Garadade.
> > >
> >
>


Re: [lucene > nori ] special characters issue

2019-02-21 Thread Furkan KAMACI
Hi,

Could you give some more information about your configuration? Also, check
here for how to debug the reason:
https://lucene.apache.org/solr/guide/7_6/analysis-screen.html

Kind Regards,
Furkan KAMACI

On Tue, Feb 12, 2019 at 11:34 AM 유정인  wrote:

>
> Hi I'm using the "nori" analyzer.
>
> I am not sure whether this is an error or intentional behavior.
>
> All special characters are filtered.
>
> Special characters stored in the dictionary are also filtered.
>
> How do I print special characters?
>
>


Re: Spring Boot Solr+ Kerberos+ Ambari

2019-02-21 Thread Furkan KAMACI
Hi Rushikesh,

Did you check here:
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_security/content/solr_ranger_configure_solrcloud_kerberos.html

By the way, which versions do you use?

Kind Regards,
Furkan KAMACI

On Thu, Feb 21, 2019 at 11:41 AM Rushikesh Garadade <
rushikeshgarad...@gmail.com> wrote:

> Hi All,
>
> I am trying to set Kerberos for Solr which is installed on Hortonworks
> Ambari.
>
> Q1. Is Ranger a mandatory component for Solr Kerberos configuration on
> Ambari?
>
> I am getting a little confused with the documents available on the internet
> for this. I tried to do it without Ranger but did not get any success.
>
> If is there any good document for the same, please let me know.
>
> Thanks,
> Rushikesh Garadade.
>


Re: Is anyone using proxy caching in front of solr?

2019-02-20 Thread Furkan KAMACI
Hi Joakim,

I suggest you read these resources:

http://lucene.472066.n3.nabble.com/Varnish-td4072057.html
http://lucene.472066.n3.nabble.com/SolrJ-HTTP-caching-td490063.html
https://wiki.apache.org/solr/SolrAndHTTPCaches

which give information about HTTP caching, including Varnish Cache and the
Last-Modified, ETag, Expires, and Cache-Control headers.
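
On the Solr side, these headers are controlled by the <httpCaching> element
in solrconfig.xml; a minimal sketch (the max-age value is an assumption):

<requestDispatcher>
  <!-- never304="false" lets Solr emit ETag/Last-Modified headers and
       answer conditional requests with 304 Not Modified -->
  <httpCaching never304="false" lastModifiedFrom="openTime" etagSeed="Solr">
    <!-- lets a proxy such as Squid or Varnish keep entries for 60 seconds -->
    <cacheControl>max-age=60, public</cacheControl>
  </httpCaching>
</requestDispatcher>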

Kind Regards,
Furkan KAMACI

On Wed, Feb 20, 2019 at 11:18 PM Joakim Hansson 
wrote:

> Hello dear user list!
> I work at a company in retail where we use solr to perform searches as you
> type.
> As soon as you type more than 1 character in the search field, solr starts
> serving hits.
> Of course this generates a lot of "unnecessary" queries (in the sense that
> they are never shown to the user) which is why I started thinking about
> using something like squid or varnish to cache a bunch of these 2-4
> character queries.
>
> It seems most stuff I find about it is from pretty old sources, but as far
> as I know solrcloud doesn't have distributed cache support.
>
> Our indexes aren't updated that frequently, about 4 - 6 times a day. We
> don't use a lot of shards and replicas (biggest index is split to 3 shards
> with 2 replicas). All shards/replicas are not on the same solr host.
> Our solr setup handles around 80-200 queries per second during the day with
> peaks at >1500 before holiday season and sales.
>
> I haven't really read up on the details yet but it seems like I could use
> etags and Expires headers to work around having to do some of that
> "unnecessary" work.
>
> Is anyone doing this? Why? Why not?
>
> - peace!
>


Re: English Analyzer

2019-02-06 Thread Furkan KAMACI
Hi,

As Walter suggested, you can check it via the Analysis page. You can find
more information here:
https://lucene.apache.org/solr/guide/7_6/analysis-screen.html

Kind Regards,
Furkan KAMACI

On Tue, Feb 5, 2019 at 8:51 PM Walter Underwood 
wrote:

> Why?
>
> If you want to look at the results, install Solr, create a two fieldtypes
> in the schema with the two analyzers, then use the analysis page to try
> them.
>
> On the other hand, you could just use KStem. The Porter stemmers are
> ancient technology and have some well-known limitations.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
> > On Feb 5, 2019, at 9:13 AM, akash jayaweera 
> wrote:
> >
> > Thank you very much for the valuable information.
> > But I need to use the org.apache.lucene.analysis *API* for English and do
> > the analysis process.
> > E.g., when I submit an English document to the API, I want the analyzed
> > document back, so I can see the difference made by the English analyzer.
> >
> > Regards,
> > *Akash Jayaweera.*
> >
> >
> > E akash.jayawe...@gmail.com 
> > M + 94 77 2472635 <+94%2077%20247%202635>
> >
> >
> > On Tue, Feb 5, 2019 at 5:54 PM Dave 
> wrote:
> >
> >> This will tell you pretty everything you need to get started
> >>
> >> https://lucene.apache.org/solr/guide/6_6/language-analysis.html
> >>
> >>> On Feb 5, 2019, at 4:55 AM, akash jayaweera  >
> >> wrote:
> >>>
> >>> Hello All,
> >>>
> >>> Can I get details on how to use the English analyzer with stemming,
> >>> lemmatization, and stopword removal techniques?
> >>> I want to see the difference between before and after applying the
> >> English
> >>> analyzer
> >>>
> >>> Regards,
> >>> *Akash Jayaweera.*
> >>>
> >>>
> >>> E akash.jayawe...@gmail.com 
> >>> M + 94 77 2472635 <+94%2077%20247%202635>
> >>
>
>


Java Advanced Imaging (JAI) Image I/O Tools are not installed

2018-11-05 Thread Furkan KAMACI
Hi All,

I use Solr 6.5.0 and am testing its OCR capabilities. It OCRs PDF files,
though very slowly. However, I see this error when I check the logs:

o.a.p.c.PDFStreamEngine Cannot read JPEG2000 image: Java Advanced Imaging
(JAI) Image I/O Tools are not installed

Any idea how to fix this?

Kind  Regards,
Furkan KAMACI


Rename of Category.QUERYHANDLER

2018-11-05 Thread Furkan KAMACI
Hi,

Solr 6.3.0 had SolrInfoMBean.Category.QUERYHANDLER. However, I cannot see
it in Solr 6.5.0.

What is the new name of that variable?

Kind Regards,
Furkan KAMACI


Solr OCR Support

2018-11-02 Thread Furkan KAMACI
Hi All,

I want to index images and pdf documents which have images into Solr. I
test it with my Solr 6.3.0.

I've installed Tesseract on my computer (a Mac). I have verified that Tesseract
works fine for extracting text from an image.

I index an image into Solr, but it has no content. However, as far as I know, I
don't need to do anything else to integrate Tesseract with Solr.

I've checked these but they were not useful for me:

http://lucene.472066.n3.nabble.com/TIKA-OCR-not-working-td4201834.html
http://lucene.472066.n3.nabble.com/Fwd-configuring-Solr-with-Tesseract-td4361908.html

My question is, how can I support OCR with Solr?

Kind Regards,
Furkan KAMACI


Re: Update Request Processors are Not Chained

2018-10-04 Thread Furkan KAMACI
I found the problem :) The problem is that the processors are not combined into one chain.

On Thu, Oct 4, 2018 at 3:57 PM Furkan KAMACI  wrote:

> I've defined my update processors as:
>
> <updateRequestProcessorChain name="langid">
>   <processor
> class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
>     <lst name="defaults">
>       <str name="langid.fl">content</str>
>       <str name="langid.whitelist">en,tr</str>
>       <str name="langid.langField">language_code</str>
>       <str name="langid.fallback">other</str>
>       <bool name="langid.map">true</bool>
>       <bool name="langid.overwrite">true</bool>
>     </lst>
>   </processor>
> </updateRequestProcessorChain>
>
> <updateRequestProcessorChain name="dedupe">
>   <processor class="solr.processor.SignatureUpdateProcessorFactory">
>     <bool name="enabled">true</bool>
>     <str name="signatureField">signature</str>
>     <bool name="overwriteDupes">false</bool>
>     <str name="fields">content</str>
>     <int name="minTokenLen">3</int>
>     <str name="signatureClass">org.apache.solr.update.processor.TextProfileSignature</str>
>   </processor>
> </updateRequestProcessorChain>
>
> <updateRequestProcessorChain name="ignore-commit-from-client" default="true">
>   <processor class="solr.IgnoreCommitOptimizeUpdateProcessorFactory">
>     <int name="statusCode">200</int>
>   </processor>
> </updateRequestProcessorChain>
>
> My /update/extract request handler is as follows:
>
> <requestHandler name="/update/extract" startup="lazy"
> class="solr.extraction.ExtractingRequestHandler">
>   <lst name="defaults">
>     <str name="lowernames">true</str>
>     <str name="captureAttr">true</str>
>     <str name="uprefix">ignored_</str>
>     <str name="fmap.content">content</str>
>     <str name="fmap.a">ignored_</str>
>     <str name="fmap.div">ignored_</str>
>     <str name="update.chain">dedupe</str>
>     <str name="update.chain">langid</str>
>     <str name="update.chain">ignore-commit-from-client</str>
>   </lst>
> </requestHandler>
>
> The dedupe chain works and the signature field is populated, but the langid processor is
> not triggered in this combination. When I change their places:
>
> <requestHandler name="/update/extract" startup="lazy"
> class="solr.extraction.ExtractingRequestHandler">
>   <lst name="defaults">
>     <str name="lowernames">true</str>
>     <str name="captureAttr">true</str>
>     <str name="uprefix">ignored_</str>
>     <str name="fmap.content">content</str>
>     <str name="fmap.a">ignored_</str>
>     <str name="fmap.div">ignored_</str>
>     <str name="update.chain">langid</str>
>     <str name="update.chain">dedupe</str>
>     <str name="update.chain">ignore-commit-from-client</str>
>   </lst>
> </requestHandler>
>
> langid works but dedupe is not activated (the signature field disappears).
>
> I use Solr 6.3. How can I solve this problem?
>
> Kind Regards,
> Furkan KAMACI
>


Update Request Processors are Not Chained

2018-10-04 Thread Furkan KAMACI
I've defined my update processors as:

<updateRequestProcessorChain name="langid">
  <processor
class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
    <lst name="defaults">
      <str name="langid.fl">content</str>
      <str name="langid.whitelist">en,tr</str>
      <str name="langid.langField">language_code</str>
      <str name="langid.fallback">other</str>
      <bool name="langid.map">true</bool>
      <bool name="langid.overwrite">true</bool>
    </lst>
  </processor>
</updateRequestProcessorChain>

<updateRequestProcessorChain name="dedupe">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <str name="signatureField">signature</str>
    <bool name="overwriteDupes">false</bool>
    <str name="fields">content</str>
    <int name="minTokenLen">3</int>
    <str name="signatureClass">org.apache.solr.update.processor.TextProfileSignature</str>
  </processor>
</updateRequestProcessorChain>

<updateRequestProcessorChain name="ignore-commit-from-client" default="true">
  <processor class="solr.IgnoreCommitOptimizeUpdateProcessorFactory">
    <int name="statusCode">200</int>
  </processor>
</updateRequestProcessorChain>
My /update/extract request handler is as follows:

<requestHandler name="/update/extract" startup="lazy"
class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="lowernames">true</str>
    <str name="captureAttr">true</str>
    <str name="uprefix">ignored_</str>
    <str name="fmap.content">content</str>
    <str name="fmap.a">ignored_</str>
    <str name="fmap.div">ignored_</str>
    <str name="update.chain">dedupe</str>
    <str name="update.chain">langid</str>
    <str name="update.chain">ignore-commit-from-client</str>
  </lst>
</requestHandler>

The dedupe chain works and the signature field is populated, but the langid processor is
not triggered in this combination. When I change their places:

<requestHandler name="/update/extract" startup="lazy"
class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="lowernames">true</str>
    <str name="captureAttr">true</str>
    <str name="uprefix">ignored_</str>
    <str name="fmap.content">content</str>
    <str name="fmap.a">ignored_</str>
    <str name="fmap.div">ignored_</str>
    <str name="update.chain">langid</str>
    <str name="update.chain">dedupe</str>
    <str name="update.chain">ignore-commit-from-client</str>
  </lst>
</requestHandler>

langid works but dedupe is not activated (the signature field disappears).

I use Solr 6.3. How can I solve this problem?

Kind Regards,
Furkan KAMACI


Re: Solr 6.6 LanguageDetector

2018-10-03 Thread Furkan KAMACI
Here is my schema configuration:

<field name="content_en" type="text_en" indexed="true" stored="true"/>
<field name="content_tr" type="text_general" indexed="true" stored="true"/>
<field name="content_other" type="text_general" indexed="true" stored="true"/>
<field name="language_code" type="string" indexed="true" stored="true"/>


On Wed, Oct 3, 2018 at 10:50 AM Furkan KAMACI 
wrote:

> Hi,
>
> I use Solr 6.6 and am trying to test automatic language detection. I've added
> this configuration to my solrconfig.xml:
>
> <updateRequestProcessorChain name="langid">
>   <processor
> class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
>     <lst name="defaults">
>       <str name="langid.fl">content</str>
>       <str name="langid.whitelist">en,tr</str>
>       <str name="langid.langField">language_code</str>
>       <str name="langid.fallback">other</str>
>       <bool name="langid.map">true</bool>
>       <bool name="langid.overwrite">true</bool>
>     </lst>
>   </processor>
> </updateRequestProcessorChain>
> ...
> <requestHandler name="/update/extract" startup="lazy"
>   class="solr.extraction.ExtractingRequestHandler">
>   <lst name="defaults">
>     <str name="lowernames">true</str>
>     <str name="captureAttr">true</str>
>     <str name="uprefix">ignored_</str>
>     <str name="fmap.content">content</str>
>     <str name="fmap.a">ignored_</str>
>     <str name="fmap.div">ignored_</str>
>     <str name="update.chain">dedupe</str>
>     <str name="update.chain">langid</str>
>     <str name="update.chain">ignore-commit-from-client</str>
>   </lst>
> </requestHandler>
>
> The content field is populated but the content_en, content_tr, content_other and
> language_code fields are empty.
>
> What am I missing?
>
> Kind Regards,
> Furkan KAMACI
>


Solr 6.6 LanguageDetector

2018-10-03 Thread Furkan KAMACI
Hi,

I use Solr 6.6 and am trying to test automatic language detection. I've added
this configuration to my solrconfig.xml:

<updateRequestProcessorChain name="langid">
  <processor
class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
    <lst name="defaults">
      <str name="langid.fl">content</str>
      <str name="langid.whitelist">en,tr</str>
      <str name="langid.langField">language_code</str>
      <str name="langid.fallback">other</str>
      <bool name="langid.map">true</bool>
      <bool name="langid.overwrite">true</bool>
    </lst>
  </processor>
</updateRequestProcessorChain>
...
<requestHandler name="/update/extract" startup="lazy"
  class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="lowernames">true</str>
    <str name="captureAttr">true</str>
    <str name="uprefix">ignored_</str>
    <str name="fmap.content">content</str>
    <str name="fmap.a">ignored_</str>
    <str name="fmap.div">ignored_</str>
    <str name="update.chain">dedupe</str>
    <str name="update.chain">langid</str>
    <str name="update.chain">ignore-commit-from-client</str>
  </lst>
</requestHandler>

The content field is populated but the content_en, content_tr, content_other and
language_code fields are empty.

What am I missing?

Kind Regards,
Furkan KAMACI


Re: Java 9

2017-11-09 Thread Furkan KAMACI
Hi,

Here is an explanation about the deprecation of the CMS garbage collector:
https://docs.oracle.com/javase/9/gctuning/concurrent-mark-sweep-cms-collector.htm

Kind Regards,
Furkan KAMACI

On Tue, Nov 7, 2017 at 10:46 AM, Daniel Collins <danwcoll...@gmail.com>
wrote:

> Oh, blimey, have Oracle gone with Ubuntu-style numbering now? :)
>
> On 7 November 2017 at 08:27, Markus Jelsma <markus.jel...@openindex.io>
> wrote:
>
> > Shawn,
> >
> > There won't be a Java 10, we'll get Java 18.3 instead. After 9 it is a
> > guess when CMS and friends are gone.
> >
> > Regards,
> > Markus
> >
> >
> >
> > -Original message-
> > > From:Shawn Heisey <apa...@elyograg.org>
> > > Sent: Tuesday 7th November 2017 0:24
> > > To: solr-user@lucene.apache.org
> > > Subject: Re: Java 9
> > >
> > > On 11/6/2017 3:07 PM, Petersen, Robert (Contr) wrote:
> > > > Anyone else been noticing this this msg when starting up solr with
> > java 9? (This is just an FYI and not a real question)
> > > >
> > > > Java HotSpot(TM) 64-Bit Server VM warning: Option UseConcMarkSweepGC
> > was deprecated in version 9.0 and will likely be removed in a future
> > release.
> > > > Java HotSpot(TM) 64-Bit Server VM warning: Option UseParNewGC was
> > deprecated in version 9.0 and will likely be removed in a future release.
> > >
> > > I have not tried Java 9 yet.
> > >
> > > Looks like G1 is now the default garbage collector.  I did not know
> that
> > > they were deprecating CMS and ParNew ... that's a little surprising.
> > > Solr's default garbage collection tuning uses those two collectors.  It
> > > is likely that those choices will be available in all versions of Java
> > > 9.  It would be very uncharacteristic for Oracle to take action on
> > > removing them until version 10, possibly later.
> > >
> > > If it were solely up to me, I would adjust Solr's startup script to use
> > > the G1 collector by default, eliminating the warnings you're seeing.
> > > It's not just up to me though.  Lucene documentation says to NEVER use
> > > the G1 collector because they believe it to be unpredictable and have
> > > the potential to cause problems.  I personally have never had any
> issues
> > > with it.  There is *one* Lucene issue mentioning problems with G1GC,
> and
> > > that issue is *specific* to the 32-bit JVM, which is not recommended
> > > because of the limited amount of memory it can use.
> > >
> > > My experiments with GC tuning show the G1 collector (now default in
> Java
> > > 9) to have very good characteristics with Solr.  I have a personal page
> > > on the Solr wiki that covers those experiments.
> > >
> > > https://wiki.apache.org/solr/ShawnHeisey
> > >
> > > Thanks,
> > > Shawn
> > >
> > >
> >
>


Re: solr cloud updatehandler stats mismatch

2017-11-09 Thread Furkan KAMACI
Hi Wei,

Did you compare it with the log files, which are under /var/solr/logs by default?

Kind Regards,
Furkan KAMACI

On Sun, Nov 5, 2017 at 6:59 PM, Wei <weiwan...@gmail.com> wrote:

> Hi,
>
> I use the following api to track the number of update requests:
>
> /solr/collection1/admin/mbeans?cat=UPDATE&stats=true&wt=json
>
>
> Result:
>
>
>- class: "org.apache.solr.handler.UpdateRequestHandler",
>- version: "6.4.2.1",
>- description: "Add documents using XML (with XSLT), CSV, JSON, or
>javabin",
>- src: null,
>- stats:
>{
>   - handlerStart: 1509824945436,
>   - requests: 106062,
>   - ...
>
>
> I am quite confused that the number of requests reported above is quite
> different from the count from solr access logs. A few times the handler
> stats is much higher: handler reports ~100k requests but in the access log
> there are only 5k update requests. What could be the possible cause?
>
> Thanks,
> Wei
>


Re: SolrJ Java API examples

2017-09-17 Thread Furkan KAMACI
Hi Vishal,

You can also check here:
https://lucene.apache.org/solr/guide/6_6/using-solrj.html#using-solrj
You can get enough information there about how to use it.
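
To give a flavor of the API, here is a minimal SolrJ query sketch (assuming
SolrJ 6.x and a core named "techproducts"; adjust the URL and fields to your
setup):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;

public class SolrJQueryDemo {
  public static void main(String[] args) throws Exception {
    // Client bound to a single core/collection URL
    SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/techproducts").build();
    SolrQuery query = new SolrQuery("*:*"); // match all documents
    query.setRows(10);
    QueryResponse response = client.query(query);
    for (SolrDocument doc : response.getResults()) {
      System.out.println(doc.getFieldValue("id"));
    }
    client.close();
  }
}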

Kind Regards,
Furkan KAMACI

On Thu, Sep 14, 2017 at 1:25 PM, Leonardo Perez Pulido <
leoperezpul...@gmail.com> wrote:

> Hi,
> This may help:
>
> https://github.com/leoperezpulido/lucene-solr/tree/master/solr/solrj/src/
> test/org/apache/solr/client/solrj
>
> Regards.
>
> On Thu, Sep 14, 2017 at 4:21 AM, Vishal Srivastava <
> vishal.smu@gmail.com
> > wrote:
>
> > Hi,
> > I'm a beginner at SolrJ, and am currently looking to implement and
> > integrate the same at my current organisation using Java.
> >
> > After a lot of research, I failed to find any good material / examples for
> > SolrJ's Java library that I could use as a reference.
> >
> > Please suggest some good material.
> >
> > Thanks a ton.
> >
> > Vishal Srivastava.
> >
>


Re: Solr Spatial Index and Data

2017-09-17 Thread Furkan KAMACI
Hi Can,

For your first question: you should share more information with us, as Rick
indicated. Do you have any errors? Do you have unique ids or not?

For the second one: you should read here:
https://cwiki.apache.org/confluence/display/solr/Spatial+Search and ask
your questions if you have any.
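
For polygon support specifically, a minimal schema sketch (assuming the RPT
field type with the JTS library on Solr's classpath; the field name and tuning
attribute values are illustrative):

<fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType"
    spatialContextFactory="org.locationtech.spatial4j.context.jts.JtsSpatialContextFactory"
    geo="true" distErrPct="0.025" maxDistErr="0.001" distanceUnits="kilometers"/>
<!-- multiValued so a document can hold several points/polygons (WKT) -->
<field name="geom" type="location_rpt" indexed="true" stored="true" multiValued="true"/>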

Kind Regards,
Furkan KAMACI

On Thu, Sep 14, 2017 at 1:34 PM, Rick Leir <rl...@leirtech.com> wrote:

> hi Can Ezgi
> > First of all, I want to use a spatial index for my data, which includes polygons
> > and points. But Solr indexed the first 18 rows; the other rows were not indexed.
>
> Do all rows have a unique id field?
>
> Are there errors in the logfile?
> cheers -- Rick
>
>
> .
>


Re: index new discovered fileds of different types

2017-07-05 Thread Furkan KAMACI
Hi Thaer,

Do you use schemaless mode [1]?

Kind Regards,
Furkan KAMACI

[1] https://cwiki.apache.org/confluence/display/solr/Schemaless+Mode

On Wed, Jul 5, 2017 at 4:23 PM, Thaer Sammar <t.sam...@geophy.com> wrote:

> Hi,
> We are trying to index documents of different types. Documents have
> different fields. Fields are known at indexing time. We run a query on a
> database and we index what comes back, using query variables as field names in
> Solr. Our current solution: we use dynamic fields with a prefix, for example
> feature_i_*, the issue with that
> 1) we need to define the type of the dynamic field and to be able to cover
> the type of discovered fields we define the following
> feature_i_* for integers, feature_t_* for string, feature_d_* for double,
> 
> 1.a) this means we need to check the type of the discovered field and then
> put in the corresponding dynamic field
> 2) at search time, we need to know the right prefix
> We are looking for help to find a way to avoid the prefix and the check of
> the type.
>
> regards,
> Thaer


Re: Automatically Restart Solr

2017-07-02 Thread Furkan KAMACI
Hi Jeck,

Here is the documentation about how you can run Solr as service:
https://lucene.apache.org/solr/guide/6_6/taking-solr-to-production.html

However, as far as I can see, you use Windows as your operating system. There is
currently an open issue for creating scripts to run Solr as a Windows Service:
https://issues.apache.org/jira/browse/SOLR-7105 but it is not yet completed.

Could you check this:
http://coding-art.blogspot.com.tr/2016/07/running-solr-61-as-windows-service.html

Kind Regards,
Furkan KAMACI


On Sun, Jul 2, 2017 at 6:12 PM, rojerick luna <rhl...@yahoo.com.invalid>
wrote:

> Hi,
>
> Anyone who successfully set this up? Thanks
>
> Best Regards,
> Jeck
>
> > On 20 Jun 2017, at 7:10 PM, rojerick luna <rhl...@yahoo.com.INVALID>
> wrote:
> >
> > Hi,
> >
> > I'm trying to automate Solr restart every week.
> >
> > I created a stop.bat and updated the start.bat which I found on an
> article online. Using stop.bat and start.bat is working fine. However when
> I created a Task Scheduler (Windows Scheduler) and setup the frequency to
> stop and start (using the bat files), it's not working; the Solr app didn't
> restart.
> >
> > Please let me know if you have successfully tried it and send me steps
> how you've setup the Task Scheduler.
> >
> > Best Regards,
> > Jeck Luna
>
>


SSN Regex Search

2017-06-22 Thread Furkan KAMACI
Hi,

How can I search for an SSN regex pattern in a way that overcomes the special dash
character issue?

As you know, /[0-9]{3}-[0-9]{2}-[0-9]{4}/ will not work as intended, since the
standard tokenizer splits the value on the dashes.
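
One approach worth sketching: index the SSN into a non-tokenized field (e.g. a
string or KeywordTokenizer-based type — the field name ssn_s here is
illustrative), so the whole value is a single term and the regex can match it:

/select?q=ssn_s:/[0-9]{3}-[0-9]{2}-[0-9]{4}/

Regex queries are evaluated against the indexed terms, so they can only match
the full pattern when the value is kept as one term.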

Kind Regards,
Furkan KAMACI


Solr SQL Subquery Support

2017-04-24 Thread Furkan KAMACI
Hi,

Does Solr SQL support subqueries?

Kind Regards,
Furkan KAMACI


Re: Inconsistent Counts in Cloud at Solr SQL Queries

2017-04-24 Thread Furkan KAMACI
Thanks for the answer! Does facet mode use Solr JSON requests or the new facet API
(which is faster than the old one)?

On Mon, Apr 24, 2017 at 2:18 PM, Joel Bernstein <joels...@gmail.com> wrote:

> SQL has two aggregation modes: facet and map_reduce. Facet uses the json
> facet API directly so SOLR-7452 would apply if it hasn't been resolved yet.
> map_reduce always gives accurate results regardless of the cardinality but
> is slower. To increase performance using map_reduce you need to increase
> the size of the cluster (workers, shards, replicas).
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Mon, Apr 24, 2017 at 5:09 AM, Furkan KAMACI <furkankam...@gmail.com>
> wrote:
>
> > Hi,
> >
> > As you know, the JSON Facet API returns inconsistent counts in a cloud setup
> > (SOLR-7452). I would like to learn whether the situation is the same for Solr
> > SQL queries too?
> >
> > Kind Regards,
> > Furkan KAMACI
> >
>


Inconsistent Counts in Cloud at Solr SQL Queries

2017-04-24 Thread Furkan KAMACI
Hi,

As you know, the JSON Facet API returns inconsistent counts in a cloud setup
(SOLR-7452). I would like to learn whether the situation is the same for Solr SQL
queries too?

Kind Regards,
Furkan KAMACI


Re: Solr Stream Content from URL

2017-04-19 Thread Furkan KAMACI
Hi Alexandre,

My content is protected via Basic Authentication. Is it possible to use
Basic Authentication with Solr Content Streams?

Kind Regards,
Furkan KAMACI

On Wed, Apr 19, 2017 at 9:13 PM, Alexandre Rafalovitch <arafa...@gmail.com>
wrote:

> Have you tried stream.url parameter after enabling the
> enableRemoteStreaming flag?
> https://cwiki.apache.org/confluence/display/solr/Content+Streams
>
> Regards,
>Alex.
> 
> http://www.solr-start.com/ - Resources for Solr users, new and experienced
>
>
> On 19 April 2017 at 13:27, Furkan KAMACI <furkankam...@gmail.com> wrote:
> > Hi,
> >
> > Is it possible to stream a CSV content from URL to Solr?
> >
> > I've tried URLDataSource but could not figure out about what to use as
> > document.
> >
> > Kind Regards,
> > Furkan KAMACI
>


Solr Stream Content from URL

2017-04-19 Thread Furkan KAMACI
Hi,

Is it possible to stream CSV content from a URL to Solr?

I've tried URLDataSource but could not figure out what to use as the
document.
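
For reference, once remote streaming is enabled in solrconfig.xml
(enableRemoteStreaming="true" on the requestParsers element), pulling a remote
CSV looks roughly like this — a sketch; the URL and collection name are
illustrative:

curl "http://localhost:8983/solr/collection1/update?stream.url=http://example.com/data.csv&stream.contentType=text/csv;charset=utf-8&commit=true"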

Kind Regards,
Furkan KAMACI


Re: Filter Facet Query

2017-04-18 Thread Furkan KAMACI
Hi Alex,

I found the reason, thanks for the help. Faceting shows all possible values,
including ones with 0 counts.

Could you help on my last question:

I have facet results like:

"", 9
"research",6
"development",3


I want to filter the empty string ("") from my facet (I don't want to add it to
fq, just filter it from the facets). How can I do that?

On Tue, Apr 18, 2017 at 11:52 AM, Alexandre Rafalovitch <arafa...@gmail.com>
wrote:

> Are you saying that all the values in the facet are zero with that
> query? The query you gave seems to be the super-basic faceting code,
> so maybe something super-basic is missing.
>
> E.g.
> *) Did you check that the documents you get back actually have any
> values in that field to facet on?
> *) Did you try making a query just by ID for a document that
> definitely has the value in that field?
> *) Did you do the query with echoParams=all to see that you are not
> having any hidden extra parameters that get appended?
>
> Regards,
>Alex.
>
> 
> http://www.solr-start.com/ - Resources for Solr users, new and experienced
>
>
> On 18 April 2017 at 11:43, Furkan KAMACI <furkankam...@gmail.com> wrote:
> > OK, it returns 0 results every time.
> >
> > So,
> >
> > I want to filter out research values with empty string ("") from facet
> > result. How can I do that?
> >
> >
> > On Tue, Apr 18, 2017 at 8:53 AM, Furkan KAMACI <furkankam...@gmail.com>
> > wrote:
> >
> >> First problem is they do not match with main query.
> >>
> >> 18 Nis 2017 Sal, saat 01:54 tarihinde Dave <
> hastings.recurs...@gmail.com>
> >> şunu yazdı:
> >>
> >>> Min.count is what you're looking for to get non 0 facets
> >>>
> >>> > On Apr 17, 2017, at 6:51 PM, Furkan KAMACI <furkankam...@gmail.com>
> >>> wrote:
> >>> >
> >>> > My query:
> >>> >
> >>> > /select?facet.field=research&facet=on&q=content:test
> >>> >
> >>> > Q1) Facet returns research values with 0 counts which has a research
> >>> value
> >>> > that is not from a document matched by main query (content:test). Is
> >>> that
> >>> > usual?
> >>> >
> >>> > Q2) I want to filter out research values with empty string ("") from
> >>> facet
> >>> > result. How can I do that?
> >>> >
> >>> > Kind Regards,
> >>> > Furkan KAMACI
> >>>
> >>
>


Re: Filter Facet Query

2017-04-18 Thread Furkan KAMACI
OK, it returns 0 results every time.

So,

I want to filter out research values with empty string ("") from facet
result. How can I do that?


On Tue, Apr 18, 2017 at 8:53 AM, Furkan KAMACI <furkankam...@gmail.com>
wrote:

> First problem is they do not match with main query.
>
> 18 Nis 2017 Sal, saat 01:54 tarihinde Dave <hastings.recurs...@gmail.com>
> şunu yazdı:
>
>> Min.count is what you're looking for to get non 0 facets
>>
>> > On Apr 17, 2017, at 6:51 PM, Furkan KAMACI <furkankam...@gmail.com>
>> wrote:
>> >
>> > My query:
>> >
>> > /select?facet.field=research&facet=on&q=content:test
>> >
>> > Q1) Facet returns research values with 0 counts which has a research
>> value
>> > that is not from a document matched by main query (content:test). Is
>> that
>> > usual?
>> >
>> > Q2) I want to filter out research values with empty string ("") from
>> facet
>> > result. How can I do that?
>> >
>> > Kind Regards,
>> > Furkan KAMACI
>>
>


Re: Filter Facet Query

2017-04-17 Thread Furkan KAMACI
The first problem is that they do not match the main query.

18 Nis 2017 Sal, saat 01:54 tarihinde Dave <hastings.recurs...@gmail.com>
şunu yazdı:

> Min.count is what you're looking for to get non 0 facets
>
> > On Apr 17, 2017, at 6:51 PM, Furkan KAMACI <furkankam...@gmail.com>
> wrote:
> >
> > My query:
> >
> > /select?facet.field=research&facet=on&q=content:test
> >
> > Q1) Facet returns research values with 0 counts which has a research
> value
> > that is not from a document matched by main query (content:test). Is that
> > usual?
> >
> > Q2) I want to filter out research values with empty string ("") from
> facet
> > result. How can I do that?
> >
> > Kind Regards,
> > Furkan KAMACI
>


Filter Facet Query

2017-04-17 Thread Furkan KAMACI
My query:

/select?facet.field=research&facet=on&q=content:test

Q1) Facet returns research values with 0 counts which has a research value
that is not from a document matched by main query (content:test). Is that
usual?

Q2) I want to filter out research values with empty string ("") from facet
result. How can I do that?

Kind Regards,
Furkan KAMACI


Re: Filter if Field Exists

2017-04-17 Thread Furkan KAMACI
@Alexandre Rafalovitch,

I could define the empty string ("") as the default value, but then I facet on
that field too, so I will need to filter empty strings from the facet generation
logic. By the way, which one is faster:

either defining the empty string as the default value and appending (OR type:"")
to queries,
or negative search clauses?

On Mon, Apr 17, 2017 at 2:22 PM, Furkan KAMACI <furkankam...@gmail.com>
wrote:

> On the other hand, that query does not do what I want.
>
> On Mon, Apr 17, 2017 at 2:18 PM, Furkan KAMACI <furkankam...@gmail.com>
> wrote:
>
>> Btw, what is the difference between
>>
>> +name:test +(type:research (*:* -type:[* TO *]))
>>
>> and
>>
>> +name:test +(type:research -type:[* TO *])
>>
>> On Mon, Apr 17, 2017 at 1:33 PM, Furkan KAMACI <furkankam...@gmail.com>
>> wrote:
>>
>>> Actually, amount of documents which have 'type' field is relatively too
>>> small across all documents at index.
>>>
>>> On Mon, Apr 17, 2017 at 7:08 AM, Alexandre Rafalovitch <
>>> arafa...@gmail.com> wrote:
>>>
>>>> What about setting a default value for the field? That is probably
>>>> faster than negative search clauses?
>>>>
>>>> Regards,
>>>>Alex.
>>>> 
>>>> http://www.solr-start.com/ - Resources for Solr users, new and
>>>> experienced
>>>>
>>>>
>>>> On 16 April 2017 at 23:58, Mikhail Khludnev <m...@apache.org> wrote:
>>>> > +name:test +(type:research (*:* -type:[* TO *]))
>>>> >
>>>> > On Sun, Apr 16, 2017 at 11:47 PM, Furkan KAMACI <
>>>> furkankam...@gmail.com>
>>>> > wrote:
>>>> >
>>>> >> Hi,
>>>> >>
>>>> >> I have a schema like:
>>>> >>
>>>> >> name,
>>>> >> department,
>>>> >> type
>>>> >>
>>>> >> type is an optional field. Some documents don't have that field.
>>>> Let's
>>>> >> assume I have these:
>>>> >>
>>>> >> Doc 1:
>>>> >> name: test
>>>> >> type: research
>>>> >>
>>>> >> Doc 2:
>>>> >> name: test
>>>> >> type: developer
>>>> >>
>>>> >> Doc 3:
>>>> >> name: test
>>>> >>
>>>> >> I want to search name: test and type:research if type field exists
>>>> (result
>>>> >> will be Doc 1 and Doc 3).
>>>> >>
>>>> >> How can I do that?
>>>> >>
>>>> >> Kind Regards,
>>>> >> Furkan KAMACI
>>>> >>
>>>> >
>>>> >
>>>> >
>>>> > --
>>>> > Sincerely yours
>>>> > Mikhail Khludnev
>>>>
>>>
>>>
>>
>


Re: Filter if Field Exists

2017-04-17 Thread Furkan KAMACI
On the other hand, that query does not do what I want.

On Mon, Apr 17, 2017 at 2:18 PM, Furkan KAMACI <furkankam...@gmail.com>
wrote:

> Btw, what is the difference between
>
> +name:test +(type:research (*:* -type:[* TO *]))
>
> and
>
> +name:test +(type:research -type:[* TO *])
>
> On Mon, Apr 17, 2017 at 1:33 PM, Furkan KAMACI <furkankam...@gmail.com>
> wrote:
>
>> Actually, amount of documents which have 'type' field is relatively too
>> small across all documents at index.
>>
>> On Mon, Apr 17, 2017 at 7:08 AM, Alexandre Rafalovitch <
>> arafa...@gmail.com> wrote:
>>
>>> What about setting a default value for the field? That is probably
>>> faster than negative search clauses?
>>>
>>> Regards,
>>>Alex.
>>> 
>>> http://www.solr-start.com/ - Resources for Solr users, new and
>>> experienced
>>>
>>>
>>> On 16 April 2017 at 23:58, Mikhail Khludnev <m...@apache.org> wrote:
>>> > +name:test +(type:research (*:* -type:[* TO *]))
>>> >
>>> > On Sun, Apr 16, 2017 at 11:47 PM, Furkan KAMACI <
>>> furkankam...@gmail.com>
>>> > wrote:
>>> >
>>> >> Hi,
>>> >>
>>> >> I have a schema like:
>>> >>
>>> >> name,
>>> >> department,
>>> >> type
>>> >>
>>> >> type is an optional field. Some documents don't have that field. Let's
>>> >> assume I have these:
>>> >>
>>> >> Doc 1:
>>> >> name: test
>>> >> type: research
>>> >>
>>> >> Doc 2:
>>> >> name: test
>>> >> type: developer
>>> >>
>>> >> Doc 3:
>>> >> name: test
>>> >>
>>> >> I want to search name: test and type:research if type field exists
>>> (result
>>> >> will be Doc 1 and Doc 3).
>>> >>
>>> >> How can I do that?
>>> >>
>>> >> Kind Regards,
>>> >> Furkan KAMACI
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > Sincerely yours
>>> > Mikhail Khludnev
>>>
>>
>>
>


Re: Filter if Field Exists

2017-04-17 Thread Furkan KAMACI
Btw, what is the difference between

+name:test +(type:research (*:* -type:[* TO *]))

and

+name:test +(type:research -type:[* TO *])

On Mon, Apr 17, 2017 at 1:33 PM, Furkan KAMACI <furkankam...@gmail.com>
wrote:

> Actually, amount of documents which have 'type' field is relatively too
> small across all documents at index.
>
> On Mon, Apr 17, 2017 at 7:08 AM, Alexandre Rafalovitch <arafa...@gmail.com
> > wrote:
>
>> What about setting a default value for the field? That is probably
>> faster than negative search clauses?
>>
>> Regards,
>>Alex.
>> 
>> http://www.solr-start.com/ - Resources for Solr users, new and
>> experienced
>>
>>
>> On 16 April 2017 at 23:58, Mikhail Khludnev <m...@apache.org> wrote:
>> > +name:test +(type:research (*:* -type:[* TO *]))
>> >
>> > On Sun, Apr 16, 2017 at 11:47 PM, Furkan KAMACI <furkankam...@gmail.com
>> >
>> > wrote:
>> >
>> >> Hi,
>> >>
>> >> I have a schema like:
>> >>
>> >> name,
>> >> department,
>> >> type
>> >>
>> >> type is an optional field. Some documents don't have that field. Let's
>> >> assume I have these:
>> >>
>> >> Doc 1:
>> >> name: test
>> >> type: research
>> >>
>> >> Doc 2:
>> >> name: test
>> >> type: developer
>> >>
>> >> Doc 3:
>> >> name: test
>> >>
>> >> I want to search name: test and type:research if type field exists
>> (result
>> >> will be Doc 1 and Doc 3).
>> >>
>> >> How can I do that?
>> >>
>> >> Kind Regards,
>> >> Furkan KAMACI
>> >>
>> >
>> >
>> >
>> > --
>> > Sincerely yours
>> > Mikhail Khludnev
>>
>
>


Re: Filter if Field Exists

2017-04-17 Thread Furkan KAMACI
Actually, the number of documents which have the 'type' field is relatively
small across all documents in the index.

On Mon, Apr 17, 2017 at 7:08 AM, Alexandre Rafalovitch <arafa...@gmail.com>
wrote:

> What about setting a default value for the field? That is probably
> faster than negative search clauses?
>
> Regards,
>Alex.
> 
> http://www.solr-start.com/ - Resources for Solr users, new and experienced
>
>
> On 16 April 2017 at 23:58, Mikhail Khludnev <m...@apache.org> wrote:
> > +name:test +(type:research (*:* -type:[* TO *]))
> >
> > On Sun, Apr 16, 2017 at 11:47 PM, Furkan KAMACI <furkankam...@gmail.com>
> > wrote:
> >
> >> Hi,
> >>
> >> I have a schema like:
> >>
> >> name,
> >> department,
> >> type
> >>
> >> type is an optional field. Some documents don't have that field. Let's
> >> assume I have these:
> >>
> >> Doc 1:
> >> name: test
> >> type: research
> >>
> >> Doc 2:
> >> name: test
> >> type: developer
> >>
> >> Doc 3:
> >> name: test
> >>
> >> I want to search name: test and type:research if type field exists
> (result
> >> will be Doc 1 and Doc 3).
> >>
> >> How can I do that?
> >>
> >> Kind Regards,
> >> Furkan KAMACI
> >>
> >
> >
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev
>


Filter if Field Exists

2017-04-16 Thread Furkan KAMACI
Hi,

I have a schema like:

name,
department,
type

type is an optional field. Some documents don't have that field. Let's
assume I have these:

Doc 1:
name: test
type: research

Doc 2:
name: test
type: developer

Doc 3:
name: test

I want to search name: test and type:research if type field exists (result
will be Doc 1 and Doc 3).

How can I do that?

Kind Regards,
Furkan KAMACI


JSON Facet API Virtual Field Support

2017-03-24 Thread Furkan KAMACI
Hi,

I am testing the JSON Facet API of Solr. Is it possible to create a virtual field
which is generated from existing fields in the response and supports
elementary arithmetic operations?

Example:

Schema fields:

products,
sold_products,
date

I want to run a date range facet and add another field to the response which is
the percentage of sold products (the ratio will be calculated as sold_products
* 100 / products).
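
For the aggregation part, a minimal JSON Facet API sketch (field names follow
the example above; the range bounds and gap are illustrative — the percentage
itself would still need to be computed from the two sums on the client side):

json.facet={
  by_month: {
    type: range, field: date,
    start: "2017-01-01T00:00:00Z", end: "2017-07-01T00:00:00Z", gap: "+1MONTH",
    facet: {
      total: "sum(products)",
      sold:  "sum(sold_products)"
    }
  }
}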

Kind Regards,
Furkan KAMACI


Count Dates Given A Range in a Multivalued Field

2017-03-20 Thread Furkan KAMACI
Hi All,

I have a multivalued date field i.e.:

[2017-02-06T00:00:00Z,2017-02-09T00:00:00Z,2017-03-04T00:00:00Z]

I want to count how many dates exist, given a date range, within such a field,
i.e.:

start: 2017-02-01T00:00:00Z
end: 2017-02-28T00:00:00Z

The result is 2 (2017-02-06T00:00:00Z and 2017-02-09T00:00:00Z). I want to do
it with the JSON Facet API.

How can I do it?


Re: Managed Schema multiValued Predict Problem

2017-03-13 Thread Furkan KAMACI
You are right, I mean schemaless mode. I saw that it's your answer ;) I've
edited solrconfig.xml and fixed it. Thanks!

On Mon, Mar 13, 2017 at 5:46 PM, Alexandre Rafalovitch <arafa...@gmail.com>
wrote:

> There is managed schema, which means it is editable via API, and there
> is 'schemaless' mode that uses that to auto-define the field based on
> the first occurance.
>
> 'schemaless' mode does not know if the field will be multi-valued the
> first time it sees content for that field. So, all the fields created
> automatically are multivalued. You can change the definition or you
> can define the field explicitly using the API or Admin UI.
>
> 'schemaless' is only there really for a quick prototyping with unknown
> content.
>
> Regards,
>Alex.
> P.s. That's my SO answer :-) Glad you found it useful.
> 
> http://www.solr-start.com/ - Resources for Solr users, new and experienced
>
>
> On 13 March 2017 at 11:15, Furkan KAMACI <furkankam...@gmail.com> wrote:
> > Hi,
> >
> > I generate dummy documents to test Solr 6.4.2. I create a field like that
> > at my test code:
> >
> > int customCount = r.nextInt(500);
> > document.addField("custom_count", customCount);
> >
> > This field is indexed as:
> >
> >   org.apache.solr.schema.TrieLongField
> >
> > and
> >
> > Multivalued.
> >
> > I want to use FieldCache on multivalued field and don't want it to be
> > multivalued. When I check managed-schema I see that:
> >
> >> positionIncrementGap="0" docValues="true" precisionStep="0"/>
> >> positionIncrementGap="0" docValues="true" multiValued="true"
> > precisionStep="0"/>
> >
> > So, it seems that it's predicted as longs instead of long.
> >
> > What is the reason behind that?
> >
> > Kind Regards,
> > Furkan KAMACI
>


Re: Managed Schema multiValued Predict Problem

2017-03-13 Thread Furkan KAMACI
OK, I found the answer here:
http://stackoverflow.com/questions/38730035/solr-schemaless-mode-creating-fields-as-multivalued

On Mon, Mar 13, 2017 at 5:15 PM, Furkan KAMACI <furkankam...@gmail.com>
wrote:

> Hi,
>
> I generate dummy documents to test Solr 6.4.2. I create a field like that
> at my test code:
>
> int customCount = r.nextInt(500);
> document.addField("custom_count", customCount);
>
> This field is indexed as:
>
>   org.apache.solr.schema.TrieLongField
>
> and
>
> Multivalued.
>
> I want to use FieldCache on multivalued field and don't want it to be
> multivalued. When I check managed-schema I see that:
>
> <fieldType name="long" class="solr.TrieLongField"
> positionIncrementGap="0" docValues="true" precisionStep="0"/>
> <fieldType name="longs" class="solr.TrieLongField"
> positionIncrementGap="0" docValues="true" multiValued="true"
> precisionStep="0"/>
>
> So, it seems that it's predicted as longs instead of long.
>
> What is the reason behind that?
>
> Kind Regards,
> Furkan KAMACI
>
>


Managed Schema multiValued Predict Problem

2017-03-13 Thread Furkan KAMACI
Hi,

I generate dummy documents to test Solr 6.4.2. I create a field like that
at my test code:

int customCount = r.nextInt(500);
document.addField("custom_count", customCount);

This field is indexed as:

  org.apache.solr.schema.TrieLongField

and

Multivalued.

I want to use FieldCache on multivalued field and don't want it to be
multivalued. When I check managed-schema I see that:

<fieldType name="long" class="solr.TrieLongField" positionIncrementGap="0"
 docValues="true" precisionStep="0"/>
<fieldType name="longs" class="solr.TrieLongField" positionIncrementGap="0"
 docValues="true" multiValued="true" precisionStep="0"/>

So, it seems that it's predicted as longs instead of long.

What is the reason behind that?

Kind Regards,
Furkan KAMACI


Re: Predicting Date Field at Schemaless Mode

2017-03-13 Thread Furkan KAMACI
Everything works well but type is predicted as String instead of Date. I
create just plain documents as follows:

SimpleDateFormat simpleDateFormat = new
SimpleDateFormat("-MM-dd'T'HH:mm");
Calendar startDate = new GregorianCalendar(2017, r.nextInt(6),
r.nextInt(28));
document.addField("custom_start",
simpleDateFormat.format(startDate.getTime()));
...
solrClient.add(document);
...
solrClient.commit();

On Mon, Mar 13, 2017 at 4:44 PM, Alexandre Rafalovitch <arafa...@gmail.com>
wrote:

> Any other definitions in that URP chain are triggered?
>
> Are you seeing this in a nested document by any chance?
>
> Regards,
>Alex.
> 
> http://www.solr-start.com/ - Resources for Solr users, new and experienced
>
>
> On 13 March 2017 at 10:29, Furkan KAMACI <furkankam...@gmail.com> wrote:
> > Hi,
> >
> > I'm testing schemaless mode of Solr 6.4.2. Solr predicts fields types
> when
> > I generate dummy data and index it to Solr. However I could not make Solr
> > to predict date fields. I tried that:
> >
> >  "custom_start":["2017-05-16T00:00"]
> >
> > which is a date parse result of SimpleDateFormat("yyyy-MM-dd'T'HH:mm");
> >
> > and
> >
> >  "custom_start":["2017-05-16"]
> >
> > from SimpleDateFormat("yyyy-MM-dd");
> >
> > at both scenarios, predicted type is:
> >
> > org.apache.solr.schema.StrField
> >
> > I use fresh version of Solr which does not have custom modifications and
> > has proper solr.ParseDateFieldUpdateProcessorFactory definition.
> >
> > What am I missing?
> >
> > Kind Regards,
> > Furkan KAMACI
>


Predicting Date Field at Schemaless Mode

2017-03-13 Thread Furkan KAMACI
Hi,

I'm testing schemaless mode of Solr 6.4.2. Solr predicts fields types when
I generate dummy data and index it to Solr. However I could not make Solr
to predict date fields. I tried that:

 "custom_start":["2017-05-16T00:00"]

which is a date parse result of SimpleDateFormat("yyyy-MM-dd'T'HH:mm");

and

 "custom_start":["2017-05-16"]

from SimpleDateFormat("yyyy-MM-dd");

at both scenarios, predicted type is:

org.apache.solr.schema.StrField

I use a fresh version of Solr which does not have custom modifications and
has a proper solr.ParseDateFieldUpdateProcessorFactory definition.

What am I missing?

Kind Regards,
Furkan KAMACI


Re: Query Elevation Component as a Managed Resource

2017-01-10 Thread Furkan KAMACI
Hi Jeffery,

I was looking into whether an issue had been raised for it or not. Thanks for pointing
it out; I'm planning to create a patch.

Kind Regards,
Furkan KAMACI


On Mon, Jan 9, 2017 at 6:44 AM, Jeffery Yuan <yuanyun...@gmail.com> wrote:

> I am looking for same things.
>
> Seems Solr doesn't support this.
>
> Maybe you can vote for https://issues.apache.org/jira/browse/SOLR-6092, so
> add a patch for it :)
>
>
>
> --
> View this message in context: http://lucene.472066.n3.
> nabble.com/Query-Elevation-Component-as-a-Managed-
> Resource-tp4312089p4313034.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Can I use SolrJ 6.3.0 to talk to a Solr 5.2.3 server?

2017-01-03 Thread Furkan KAMACI
Hi Jennifer,

Take a look at index compatibility besides the dependencies. Here is the
explanation:

Index Format Changes

Solr 6 has no support for reading Lucene/Solr 4.x and earlier indexes.  Be
sure to run the Lucene IndexUpgrader included with Solr 5.5 if you might
still have old 4x formatted segments in your index. Alternatively: fully
optimize your index with Solr 5.5 to make sure it consists only of one
up-to-date index segment.

You can read more from here:
https://cwiki.apache.org/confluence/display/solr/Major+Changes+from+Solr+5+to+Solr+6

Kind Regards,
Furkan KAMACI


On Tue, Jan 3, 2017 at 7:35 PM, Jennifer Coston <
jennifer.cos...@raytheon.com> wrote:

> Hello,
>
> I am running into a conflict with Solr and ElasticSearch. We are trying to
> add support for Elastic Search 5.1.1 which requires Lucene 6.3.0 to an
> existing system that uses Solr 5.2.3. At the moment I am using SolrJ 5.3.1
> to talk to the 5.2.3 Server. I was hoping I could just update the SolrJ
> libraries to 6.3.0 so the Lucene conflict goes away, but when I try to run
> my unit tests I'm seeing this error:
>
> java.util.ServiceConfigurationError: Cannot instantiate SPI class:
> org.apache.lucene.codecs.simpletext.SimpleTextPostingsFormat
> at org.apache.lucene.util.NamedSPILoader.reload(
> NamedSPILoader.java:82)
> at org.apache.lucene.codecs.PostingsFormat.
> reloadPostingsFormats(PostingsFormat.java:132)
> at org.apache.solr.core.SolrResourceLoader.
> reloadLuceneSPI(SolrResourceLoader.java:237)
> at org.apache.solr.core.SolrResourceLoader.<init>(
> SolrResourceLoader.java:182)
> at org.apache.solr.core.SolrResourceLoader.<init>(
> SolrResourceLoader.java:142)
> at org.apache.solr.core.CoreContainer.<init>(
> CoreContainer.java:217)
> at com.rtn.iaf.catalog.test.SolrAnalyticClientTest.
> setUpBeforeClass(SolrAnalyticClientTest.java:59)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(
> NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
> DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.junit.runners.model.FrameworkMethod$1.
> runReflectiveCall(FrameworkMethod.java:50)
> at org.junit.internal.runners.
> model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at org.junit.runners.model.FrameworkMethod.
> invokeExplosively(FrameworkMethod.java:47)
> at org.junit.internal.runners.statements.RunBefores.
> evaluate(RunBefores.java:24)
> at org.junit.internal.runners.
> statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.runners.ParentRunner.run(ParentRunner.
> java:363)
> at org.eclipse.jdt.internal.junit4.runner.
> JUnit4TestReference.run(JUnit4TestReference.java:86)
> at org.eclipse.jdt.internal.junit.runner.TestExecution.
> run(TestExecution.java:38)
> at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.
> runTests(RemoteTestRunner.java:459)
> at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.
> runTests(RemoteTestRunner.java:675)
> at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.
> run(RemoteTestRunner.java:382)
> at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.
> main(RemoteTestRunner.java:192)
> Caused by: java.lang.IllegalAccessException: Class 
> org.apache.lucene.util.NamedSPILoader
> can not access a member of class 
> org.apache.lucene.codecs.simpletext.SimpleTextPostingsFormat
> with modifiers "public"
> at sun.reflect.Reflection.ensureMemberAccess(Reflection.
> java:102)
> at java.lang.Class.newInstance(Class.java:436)
> at org.apache.lucene.util.NamedSPILoader.reload(
> NamedSPILoader.java:72)
> ... 22 more
> Is it possible to talk to the 5.2.3 Server using SolrJ 6.3.0?
>
> Here are the Solr Dependencies I have in my pom.xml:
>
> <dependencies>
>   <dependency>
>     <groupId>org.apache.solr</groupId>
>     <artifactId>solr-solrj</artifactId>
>     <version>6.3.0</version>
>   </dependency>
>
>   <dependency>
>     <groupId>org.apache.solr</groupId>
>     <artifactId>solr-core</artifactId>
>     <version>6.3.0</version>
>     <scope>test</scope>
>     <exclusions>
>       <exclusion>
>         <groupId>jdk.tools</groupId>
>         <artifactId>jdk.tools</artifactId>
>       </exclusion>
>     </exclusions>
>   </dependency>
> </dependencies>

Query Elevation Component as a Managed Resource

2017-01-03 Thread Furkan KAMACI
Hi,

Can we access to Query Elevation Component as a Managed Resource? If not, I
would like to add that functionality.

Kind Regards,
Furkan KAMACI


Empty Highlight Problem - Solr 6.3.0

2016-12-23 Thread Furkan KAMACI
Hi All,

I'm trying the highlighter component in Solr 6.3. I have a problem when I index
PDF files. Even though I know that the given keyword exists in a result document
(it is returned as a result because of a hit in the document), the highlighting
field is empty in the response.

I suspect it happens for documents which have large content. How can I solve
this problem? I've tried the Standard Highlighter and the FastVector
Highlighter (termVectors, termPositions, and termOffsets are enabled for the hl
fields) but the result is the same.
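
One knob worth checking for large documents — a sketch, assuming the default
hl.maxAnalyzedChars limit (51200 characters) is being hit, so matches beyond
that offset are never highlighted; raising it per request looks like:

/select?q=content:test&hl=true&hl.fl=content&hl.maxAnalyzedChars=1000000

The same parameter can also be set in the request handler defaults in
solrconfig.xml.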

Kind Regards,
Furkan KAMACI


FuzzyLookupFactory throws FuzzyLookupFactory

2016-12-22 Thread Furkan KAMACI
Hi,

When I try suggester component and use FuzzyLookupFactory I get that error:

"error": {
"msg": "java.lang.StackOverflowError",
"trace": "java.lang.RuntimeException: FuzzyLookupFactory n\tat
org.apache.solr.servlet.HttpSolrCall.sendError(HttpSolrCall.java:607)\n\tat

I searched on the web and there are some other people who get that error
too. Responses to such questions indicate that it may be expected if there is
a lot of data in the index. However, I index just 4 small PDF files and get that
error when I want to build the suggester.

Any ideas?

Kind Regards,
Furkan KAMACI


Limit Suggested Term Counts

2016-12-22 Thread Furkan KAMACI
I have a list that I make suggestions on. When I check the analysis page I
see that the field is analysed as I intended, i.e. the tokens are:

java
linux
mac

However, when I use BlendedInfixLookupFactory to run a suggestion on that
field, it returns me the whole paragraph instead of a limited number of terms (I
know that such implementations do return suggestions even when the desired terms
are inside the text, not at the beginning).

Is it possible to limit that suggested term count?

Kind Regards,
Furkan KAMACI


Re: Solr Suggester

2016-12-22 Thread Furkan KAMACI
Hi Emir,

As far as I know, it should be enough for a suggestion field to be stored=true?
Should it be both indexed and stored?

Kind Regards,
Furkan KAMACI

On Thu, Dec 22, 2016 at 11:31 AM, Emir Arnautovic <
emir.arnauto...@sematext.com> wrote:

> That is because my_field_2 is not indexed.
>
> Regards,
> Emir
>
>
> On 21.12.2016 18:04, Furkan KAMACI wrote:
>
>> Hi All,
>>
>> I have fields like these:
>>
>> <field name="my_field_1" type="text_general" indexed="true" stored="true"
>>   multiValued="false" />
>>
>> <field name="my_field_2" type="text_general" indexed="false"
>> stored="true" multiValued="false"/>
>>
>> When I run a suggester on my_field_1 it returns response. However
>> my_field_2 doesn't. I've defined suggester as:
>>
>>    <str name="name">suggester</str>
>>    <str name="lookupImpl">FuzzyLookupFactory</str>
>>    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
>>
>> What can be the reason?
>>
>> Kind Regards,
>> Furkan KAMACI
>>
>>
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>
>


Solr Suggester

2016-12-21 Thread Furkan KAMACI
Hi All,

I have fields like these:

<field name="my_field_1" type="text_general" indexed="true" stored="true"
 multiValued="false" />

<field name="my_field_2" type="text_general" indexed="false" stored="true"
 multiValued="false"/>
When I run a suggester on my_field_1 it returns a response. However
my_field_2 doesn't. I've defined the suggester as:

  <str name="name">suggester</str>
  <str name="lookupImpl">FuzzyLookupFactory</str>
  <str name="dictionaryImpl">DocumentDictionaryFactory</str>

What can be the reason?

Kind Regards,
Furkan KAMACI


Re: Soft commit and reading data just after the commit

2016-12-18 Thread Furkan KAMACI
Hi Lasitha,

First of all, did you check these:

https://cwiki.apache.org/confluence/display/solr/Near+Real+Time+Searching
https://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

After that, if you cannot adjust your configuration, you can give more
information and we can find a solution.
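
If you need read-your-own-writes behaviour from SolrJ, a minimal sketch
(assuming SolrJ 6.x; the URL, core name and document are illustrative):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class ReadYourWrites {
  public static void main(String[] args) throws Exception {
    HttpSolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/collection1").build();
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "123");
    client.add(doc);
    // commit(waitFlush, waitSearcher, softCommit): waitSearcher=true blocks
    // until the new searcher is open, so the query below sees the update
    client.commit(true, true, true);
    System.out.println(client.query(new SolrQuery("id:123")).getResults().getNumFound());
    client.close();
  }
}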

Kind Regards,
Furkan KAMACI

On Sun, Dec 18, 2016 at 2:28 PM, Lasitha Wattaladeniya <watt...@gmail.com>
wrote:

> Hi furkan,
>
> Thanks for your reply, it is generally a query-heavy system. We are using
> real-time indexing for editing the available data.
>
> Regards,
> Lasitha
>
> Lasitha Wattaladeniya
> Software Engineer
>
> Mobile : +6593896893 <+65%209389%206893>
> Blog : techreadme.blogspot.com
>
> On Sun, Dec 18, 2016 at 8:12 PM, Furkan KAMACI <furkankam...@gmail.com>
> wrote:
>
>> Hi Lasitha,
>>
>> What is your indexing / querying requirements. Do you have an index
>> heavy/light  - query heavy/light system?
>>
>> Kind Regards,
>> Furkan KAMACI
>>
>> On Sun, Dec 18, 2016 at 11:35 AM, Lasitha Wattaladeniya <
>> watt...@gmail.com>
>> wrote:
>>
>> > Hello devs,
>> >
>> > I'm here with another problem i'm facing. I'm trying to do a commit
>> (soft
>> > commit) through solrj and just after the commit, retrieve the data from
>> > solr (requirement is to get updated data list).
>> >
>> > I'm using soft commit instead of the hard commit, is previously I got an
>> > error "Exceeded limit of maxWarmingSearchers=2, try again later"
>> because of
>> > too many commit requests. Now I have removed the explicit commit and has
>> > let the solr to do the commit using autoSoftCommit *(1 mili second)* and
>> > autoCommit *(30 seconds)* configurations. Now I'm not getting any errors
>> > when i'm committing frequently.
>> >
>> > The problem i'm facing now is, I'm not getting the updated data when I
>> > fetch from solr just after the soft commit. So in this case what are the
>> > best practices to use ? to wait 1 mili second before retrieving data
>> after
>> > soft commit ? I don't feel like waiting from client side is a good
>> option.
>> > Please give me some help from your expert knowledge
>> >
>> > Best regards,
>> > Lasitha Wattaladeniya
>> > Software Engineer
>> >
>> > Mobile : +6593896893
>> > Blog : techreadme.blogspot.com
>> >
>>
>
>


Re: Confusing debug=timing parameter

2016-12-18 Thread Furkan KAMACI
Hi,

Let me explain the *time parameters* in Solr:

*Timing* parameter of debug returns information about how long the query
took to process.

*Query time* shows how long it took Solr to get the search
results. It doesn't include reading bits from disk, etc.

Also, there is another measurement named *elapsed time*. It shows the time
frame from the query being sent to Solr until the response is returned. It
includes query time, reading bits from disk, constructing the response and
transmitting it, etc.

Kind Regards,
Furkan KAMACI

On Sat, Dec 17, 2016 at 6:43 PM, S G <sg.online.em...@gmail.com> wrote:

> Hi,
>
> I am using Solr 4.10 and its response time for the clients is not very
> good.
> Even though Solr's plugin/stats page shows less than 200 milliseconds,
> clients report several seconds in response time.
>
> So I tried using debug-timing parameter from the Solr UI and this is what I
> got.
> Note how the QTime is 2978 while the time in debug-timing is 19320.
>
> What does this mean?
> How can Solr return a result in 3 seconds when the time taken between two
> points on the same path is 20 seconds?
>
> {
>   "responseHeader": {
> "status": 0,
> "QTime": 2978,
> "params": {
>   "q": "*:*",
>   "debug": "timing",
>   "indent": "true",
>   "wt": "json",
>   "_": "1481992653008"
> }
>   },
>   "response": {
> "numFound": 1565135270,
> "start": 0,
> "maxScore": 1,
> "docs": [
>   
> ]
>   },
>   "debug": {
> "timing": {
>   "time": 19320,
>   "prepare": {
> "time": 4,
> "query": {
>   "time": 3
> },
> "facet": {
>   "time": 0
> },
> "mlt": {
>   "time": 0
> },
> "highlight": {
>   "time": 0
> },
> "stats": {
>   "time": 0
> },
> "expand": {
>   "time": 0
> },
> "debug": {
>   "time": 0
> }
>   },
>   "process": {
> "time": 19315,
> "query": {
>   "time": 19309
> },
> "facet": {
>   "time": 0
> },
> "mlt": {
>   "time": 1
> },
> "highlight": {
>   "time": 0
> },
> "stats": {
>   "time": 0
> },
> "expand": {
>   "time": 0
> },
> "debug": {
>   "time": 5
> }
>   }
> }
>   }
> }
>


Re: Soft commit and reading data just after the commit

2016-12-18 Thread Furkan KAMACI
Hi Lasitha,

What are your indexing / querying requirements? Do you have an index
heavy/light - query heavy/light system?

Kind Regards,
Furkan KAMACI

On Sun, Dec 18, 2016 at 11:35 AM, Lasitha Wattaladeniya <watt...@gmail.com>
wrote:

> Hello devs,
>
> I'm here with another problem i'm facing. I'm trying to do a commit (soft
> commit) through solrj and just after the commit, retrieve the data from
> solr (requirement is to get updated data list).
>
> I'm using soft commits instead of hard commits, as previously I got the
> error "Exceeded limit of maxWarmingSearchers=2, try again later" because of
> too many commit requests. Now I have removed the explicit commit and have
> let Solr do the commit using the autoSoftCommit *(1 millisecond)* and
> autoCommit *(30 seconds)* configurations. Now I'm not getting any errors
> when I'm committing frequently.
>
> The problem I'm facing now is that I'm not getting the updated data when I
> fetch from Solr just after the soft commit. So in this case, what are the
> best practices to use? To wait 1 millisecond before retrieving data after a
> soft commit? I don't feel like waiting on the client side is a good option.
> Please give me some help from your expert knowledge
>
> Best regards,
> Lasitha Wattaladeniya
> Software Engineer
>
> Mobile : +6593896893
> Blog : techreadme.blogspot.com
>


Checking Optimal Values for BM25

2016-12-15 Thread Furkan KAMACI
Hi,

Solr's default similarity is BM25 now. Its parameters are defined as

k1=1.2, b=0.75

by default. However, is there any way to check the effect of using
different coefficients for BM25, in order to find the optimal values?
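
For experimenting, the coefficients can be overridden in the schema — a minimal
sketch, assuming a per-fieldType similarity element (the values shown are just
the defaults to start from; since k1 and b are applied at query time, different
values can usually be compared without re-indexing):

<similarity class="solr.BM25SimilarityFactory">
  <float name="k1">1.2</float>
  <float name="b">0.75</float>
</similarity>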

Kind Regards,
Furkan KAMACI


Setting Shard Count at Initial Startup of SolrCloud

2016-12-12 Thread Furkan KAMACI
Hi,

I have an external ZooKeeper. I don't want to use SolrCloud just as a test. I upload
configs to ZooKeeper:

server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 -cmd upconfig
-confdir server/solr/my_collection/conf -confname my_collection

Start servers:

Server 1: bin/solr start -cloud -d server -p 8983 -z localhost:2181
Server 2: bin/solr start -cloud -d server -p 8984 -z localhost:2181

As usual, the shard count will be 1 with this approach. I want 2 shards. I know
that I can create a sharded collection with:

bin/solr create

However, I would have to delete the existing collection and then I can create the
shards. Is there any possibility to set the number of shards, maximum shards per
node, etc. at the initial start of Solr?
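
For reference, a sketch of doing this through the Collections API instead of
bin/solr create (the collection and config names follow the example above;
shard and replica counts are illustrative):

curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=my_collection&numShards=2&replicationFactor=2&maxShardsPerNode=2&collection.configName=my_collection"

Since the configset is already in ZooKeeper, this can be run right after the
nodes are started, without recreating anything.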

Kind Regards,
Furkan KAMACI


Map Highlight Field into Another Field

2016-12-12 Thread Furkan KAMACI
Hi,

One can use * in highlight fields, like:

content_*

So, content_de and content_en can match it. However, the response will
include fields like:

"highlighting":{
"my query":{
  "content_de":
  "content_en":
...

Is it possible to map matched fields into a predefined field, like:

content_* => content

So, one can handle a generic name for such cases in the response?

If not, I can implement such a feature.

Kind Regards,
Furkan KAMACI


Copying Tokens

2016-12-12 Thread Furkan KAMACI
Hi,

I'm testing language identification. I've enabled it in solrconfig.xml. Here
are my dynamic fields in the schema:

<dynamicField name="*_en" type="text_en" indexed="true" stored="true"/>
<dynamicField name="*_ru" type="text_ru" indexed="true" stored="true"/>
So, after indexing, I see that fields are generated:

content_en
content_ru

I copy my fields into a text field:

<copyField source="content_en" dest="text"/>
<copyField source="content_ru" dest="text"/>
Here is my text field:

<field name="text" type="text_general" indexed="true" stored="true" multiValued="true"/>

I want to let users search on only the *text* field. However, when I copy
those fields into the *text* field, they are indexed according to text_general.

How can I copy *tokens* to the *text* field?

Kind Regards,
Furkan KAMACI


Re: Unicode Character Problem

2016-12-12 Thread Furkan KAMACI
Hi Ahmet,

I don't see any weird characters when I manually copy it to a text editor.

On Sat, Dec 10, 2016 at 6:19 PM, Ahmet Arslan <iori...@yahoo.com.invalid>
wrote:

> Hi Furkan,
>
> I am pretty sure this is a pdf extraction thing.
> Turkish characters caused us trouble in the past during extracting text
> from pdf files.
> You can confirm by performing manual copy-paste from original pdf file.
>
> Ahmet
>
>
> On Friday, December 9, 2016 8:44 PM, Furkan KAMACI <furkankam...@gmail.com>
> wrote:
> Hi,
>
> I'm trying to index Turkish characters. These are what I see at my index (I
> see both of them at different places of my content):
>
> aç �klama
> açıklama
>
> These are same words but indexed different (same weird character at first
> one). I see that there is not a weird character when I check the original
> PDF file.
>
> What do you think about it. Is it related to Solr or Tika?
>
> PS: I use text_general for analyser of content field.
>
> Kind Regards,
> Furkan KAMACI
>


Unicode Character Problem

2016-12-09 Thread Furkan KAMACI
Hi,

I'm trying to index Turkish characters. This is what I see in my index (I
see both of them at different places in my content):

aç �klama
açıklama

These are the same words but indexed differently (the weird character is in the
first one). I see that there is no weird character when I check the original
PDF file.

What do you think about it? Is it related to Solr or Tika?

PS: I use text_general for analyser of content field.

Kind Regards,
Furkan KAMACI


Re: LukeRequestHandler Error getting file length for [segments_1l]

2016-12-09 Thread Furkan KAMACI
No OOM, no corrupted index. Just a clean install with a few documents. Similar
to this:
http://lucene.472066.n3.nabble.com/NoSuchFileException-errors-common-on-version-5-5-0-td4263072.html

On Wed, Nov 30, 2016 at 3:19 AM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 11/29/2016 8:40 AM, halis Yılboğa wrote:
> > it is not normal to get that many errors, actually. The main problem should be
> > with your index. It seems to me your index is corrupted.
> >
> > 29 Kas 2016 Sal, 14:40 tarihinde, Furkan KAMACI <furkankam...@gmail.com>
> > şunu yazdı:
> >
> >> On the other hand, my Solr instance stops frequently due to such errors:
> >>
> >> 2016-11-29 12:25:36.962 WARN  (qtp1528637575-14) [   x:collection1]
> >> o.a.s.h.a.LukeRequestHandler Error getting file length for [segments_c]
> >> java.nio.file.NoSuchFileException: data/index/segments_c
>
> If your Solr instance is actually stopping, I would suspect the OOM
> script, assuming a non-windows system.  On non-windows systems, recent
> versions of Solr have a script that forcibly terminates Solr in the
> event of an OutOfMemoryError.  This script has its own log, which would
> be in the same place as solr.log.
>
> I've never heard of Solr actually crashing on a normally configured
> system, and I'm reasonably sure that the message you've indicated is not
> something that would cause a crash.  In fact, I've never seen it cause
> any real issues, just the warning message.
>
> Thanks,
> Shawn
>
>


Re: LukeRequestHandler Error getting file length for [segments_1l]

2016-11-29 Thread Furkan KAMACI
   ] o.e.j.s.ServerConnector
Stopped ServerConnector@3a52dba3{HTTP/1.1,[http/1.1]}{0.0.0.0:9983}
2016-11-29 12:26:03.870 INFO  (Thread-0) [   ] o.a.s.c.CoreContainer
Shutting down CoreContainer instance=226744878
2016-11-29 12:26:03.871 INFO  (coreCloseExecutor-12-thread-1) [
x:collection1] o.a.s.c.SolrCore [collection1]  CLOSING SolrCore
org.apache.solr.core.SolrCore@447dc7d4
2016-11-29 12:26:03.884 WARN  (Thread-0) [   ]
o.e.j.s.ServletContextHandler ServletContextHandler.setHandler should not
be called directly. Use insertHandler or setSessionHandler etc.

On Tue, Nov 29, 2016 at 1:15 PM, Furkan KAMACI <furkankam...@gmail.com>
wrote:

> I use Solr 6.3 and get too many warnings like the one below. Is this usual?
>
> WARN true LukeRequestHandler Error getting file length for [segments_1l]
> java.nio.file.NoSuchFileException: /home/server/solr/collection1/data/index/segments_1l
> at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
> at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
> at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
> at sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:55)
> at sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:144)
> at sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99)
> at java.nio.file.Files.readAttributes(Files.java:1737)
> at java.nio.file.Files.size(Files.java:2332)
> at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:243)
> at org.apache.lucene.store.NRTCachingDirectory.fileLength(NRTCachingDirectory.java:128)
> at org.apache.solr.handler.admin.LukeRequestHandler.getFileLength(LukeRequestHandler.java:598)
> at org.apache.solr.handler.admin.LukeRequestHandler.getIndexInfo(LukeRequestHandler.java:586)
> at org.apache.solr.handler.admin.LukeRequestHandler.handleRequestBody(LukeRequestHandler.java:137)
> at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:153)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:2213)
> at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:654)
> at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:460)
> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:303)
> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:254)
> at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
> at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
> at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
> at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)
> at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
> at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
> at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)
> at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
> at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
> at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> at org.eclipse.jetty.server.Server.handle(Server.java:518)
> at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
> at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)
> at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
> at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
> at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
> at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)
> at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)
> at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
> at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
> at java.lang.Thread.run(Thread.java:745)
>
> Kind Regards,
> Furkan KAMACI
>


LukeRequestHandler Error getting file length for [segments_1l]

2016-11-29 Thread Furkan KAMACI
I use Solr 6.3 and am getting many warnings like the one below. Is this usual:

WARN true LukeRequestHandler Error getting file length for [segments_1l]
java.nio.file.NoSuchFileException: /home/server/solr/collection1/data/index/segments_1l
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:55)
at sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:144)
at sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99)
at java.nio.file.Files.readAttributes(Files.java:1737)
at java.nio.file.Files.size(Files.java:2332)
at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:243)
at org.apache.lucene.store.NRTCachingDirectory.fileLength(NRTCachingDirectory.java:128)
at org.apache.solr.handler.admin.LukeRequestHandler.getFileLength(LukeRequestHandler.java:598)
at org.apache.solr.handler.admin.LukeRequestHandler.getIndexInfo(LukeRequestHandler.java:586)
at org.apache.solr.handler.admin.LukeRequestHandler.handleRequestBody(LukeRequestHandler.java:137)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:153)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2213)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:654)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:460)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:303)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:254)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:518)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
at java.lang.Thread.run(Thread.java:745)

Kind Regards,
Furkan KAMACI


Highlight is Empty for A Matched Query

2016-11-27 Thread Furkan KAMACI
My content has this line:

 \n \n\n Intelligent En

When I search for *intelligent*, it does return 1 response. My content
field is defined as:

 

The highlighter is the default one. I just set *highlight=on* and
*hl.field=content*. However, my response does not have any highlights.
When I try different keywords, some of them get a highlight section and
some do not. What may be the problem? I didn't edit stopwords, synonyms,
etc.

Kind Regards,
Furkan KAMACI


Re: Metadata and Newline Characters at Content

2016-11-26 Thread Furkan KAMACI
PS: \n characters are not shown in the browser, but they break how the
highlighter works. \n characters are counted toward fragsize too.

On Sat, Nov 26, 2016 at 9:47 PM, Furkan KAMACI <furkankam...@gmail.com>
wrote:

> Hi Erick,
>
> I resolved my metadata problem by configuring solrconfig.xml. However,
> even when I post data with post.sh, I see content like:
>
> CANADA �1 \n  \n \n   \n Place
>
> I have newline characters as \n and some non-ASCII characters. As far as I
> understand, it is usual to have such characters, because that is a PDF file
> and its newline characters are interpreted as *\n* by Solr. How can I
> remove them (the \n and non-ASCII characters)?
>
> Kind Regards,
> Furkan KAMACI
>
> On Thu, Nov 24, 2016 at 8:58 PM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>
>> Not sure. What have you tried?
>>
>>  For production situations or when you want to take total control of
>> the indexing process, I strongly recommend that you put the Tika
>> parsing on the _client_.
>>
>> Here's a writeup on this topic:
>>
>> https://lucidworks.com/blog/2012/02/14/indexing-with-solrj/
>>
>> Best,
>> Erick
>>
>> On Thu, Nov 24, 2016 at 10:37 AM, Furkan KAMACI <furkankam...@gmail.com>
>> wrote:
>> > Hi Erick,
>> >
>> > When I check the *Solr* documentation I see that [1]:
>> >
>> > *In addition to Tika's metadata, Solr adds the following metadata
>> (defined
>> > in ExtractingMetadataConstants):*
>> >
>> > *"stream_name" - The name of the ContentStream as uploaded to Solr.
>> > Depending on how the file is uploaded, this may or may not be set.*
>> > *"stream_source_info" - Any source info about the stream. See
>> > ContentStream.*
>> > *"stream_size" - The size of the stream in bytes(?)*
>> > *"stream_content_type" - The content type of the stream, if available.*
>> >
>> > So, it seems that these may be added not by Tika, but by Solr. Do you
>> > know how to enable/disable this feature?
>> >
>> > Kind Regards,
>> > Furkan KAMACI
>> >
>> > [1] https://wiki.apache.org/solr/ExtractingRequestHandler
>> >
>> > On Thu, Nov 24, 2016 at 6:51 PM, Erick Erickson <
>> erickerick...@gmail.com>
>> > wrote:
>> >
>> >> about PatternCaptureGroupFilterFactory. This isn't going to help. The
>> >> data you see when you return stored data is _before_ any analysis so
>> >> the PatternFactory won't be applied. You could do this in a
>> >> ScriptUpdateProcessorFactory. Or, just don't worry about it and have
>> >> the real app deal with it.
>> >>
>> >> I don't particularly know about the Tika settings, that's largely a
>> guess.
>> >>
>> >> Best,
>> >> Erick
>> >>
>> >> On Thu, Nov 24, 2016 at 8:43 AM, Furkan KAMACI <furkankam...@gmail.com
>> >
>> >> wrote:
>> >> > Hi Erick,
>> >> >
>> >> > 1) I am looking at the stored data via the Solr Admin UI. I send the
>> >> > query and check what is in the content field.
>> >> >
>> >> > 2) I can debug the Tika settings if you think that having such metadata
>> >> > fields combined into the content field is not the desired behaviour.
>> >> >
>> >> > *PS: *Is there any solution to get rid of it except for
>> >> > using PatternCaptureGroupFilterFactory?
>> >> >
>> >> > Kind Regards,
>> >> > Furkan KAMACI
>> >> >
>> >> > On Thu, Nov 24, 2016 at 6:31 PM, Erick Erickson <
>> erickerick...@gmail.com
>> >> >
>> >> > wrote:
>> >> >
>> >> >> 1> I'm assuming when you "see" this data you're looking at the
>> stored
>> >> >> data, right? It's a verbatim copy of whatever you sent to the field.
>> >> >> I'm guessing it's a character-encoding mismatch between the source
>> and
>> >> >> what you use to display.
>> >> >>
>> >> >> 2> How are you extracting this data? There are Tika options I think
>> >> >> that can/do mush fields together.
>> >> >>
>> >> >> Best,
>> >> >> Erick
>> >> >>
>> >> >>
>> >> >>
>> >> >> On Thu, Nov 24, 2016 at 7:54 AM, Furkan KAMACI <
>> furkankam...@gmail.com>
>> >> >> wrote:
>> >> >> > Hi,
>> >> >> >
>> >> >> > I'm testing Solr 4.9.1 and have indexed documents with it. The
>> >> >> > content field in my schema has the text_general field type, which is
>> >> >> > not modified from the original. I do not copy any fields to content.
>> >> >> > When I check the data I see content values like:
>> >> >> >
>> >> >> >  " \n \nstream_source_info MARLON BRANDO.rtf
>>  \nstream_content_type
>> >> >> > application/rtf   \nstream_size 13580   \nstream_name MARLON
>> >> BRANDO.rtf
>> >> >> > \nContent-Type application/rtf   \nresourceName MARLON
>> BRANDO.rtf   \n
>> >> >> \n
>> >> >> > \n  1. Vivien Leigh and Marlon Brando in \"A Streetcar Named
>> Desire\"
>> >> >> > directed by Elia Kazan \n"
>> >> >> >
>> >> >> > My questions:
>> >> >> >
>> >> >> > 1) Is it usual to have these newline characters?
>> >> >> > 2) Is it usual to have file metadata at the beginning of the content
>> >> >> > (i.e. stream source, stream_content_type), or is it related to the
>> >> >> > tool that I use to post data to Solr?
>> >> >> >
>> >> >> > Kind Regards,
>> >> >> > Furkan KAMACI
>> >> >>
>> >>
>>
>
>


Re: Metadata and Newline Characters at Content

2016-11-26 Thread Furkan KAMACI
Hi Erick,

I resolved my metadata problem by configuring solrconfig.xml. However, even
when I post data with post.sh, I see content like:

CANADA �1 \n  \n \n   \n Place

I have newline characters as \n and some non-ASCII characters. As far as I
understand, it is usual to have such characters, because that is a PDF file
and its newline characters are interpreted as *\n* by Solr. How can I
remove them (the \n and non-ASCII characters)?
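
One way to sketch this — assuming strip-newlines and content are placeholder
names for your own chain and field — is a field-mutating update processor in
solrconfig.xml, which rewrites the value before it is stored, e.g. collapsing
every whitespace run (including \n) into a single space:

<updateRequestProcessorChain name="strip-newlines">
  <processor class="solr.RegexReplaceProcessorFactory">
    <str name="fieldName">content</str>
    <str name="pattern">\s+</str>
    <str name="replacement"> </str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>

A second RegexReplaceProcessorFactory with a pattern like [^\x20-\x7E] could
drop the non-ASCII characters, and the chain is selected per request with
update.chain=strip-newlines.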

Kind Regards,
Furkan KAMACI

On Thu, Nov 24, 2016 at 8:58 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> Not sure. What have you tried?
>
>  For production situations or when you want to take total control of
> the indexing process, I strongly recommend that you put the Tika
> parsing on the _client_.
>
> Here's a writeup on this topic:
>
> https://lucidworks.com/blog/2012/02/14/indexing-with-solrj/
>
> Best,
> Erick
>
> On Thu, Nov 24, 2016 at 10:37 AM, Furkan KAMACI <furkankam...@gmail.com>
> wrote:
> > Hi Erick,
> >
> > When I check the *Solr* documentation I see that [1]:
> >
> > *In addition to Tika's metadata, Solr adds the following metadata
> (defined
> > in ExtractingMetadataConstants):*
> >
> > *"stream_name" - The name of the ContentStream as uploaded to Solr.
> > Depending on how the file is uploaded, this may or may not be set.*
> > *"stream_source_info" - Any source info about the stream. See
> > ContentStream.*
> > *"stream_size" - The size of the stream in bytes(?)*
> > *"stream_content_type" - The content type of the stream, if available.*
> >
> > So, it seems that these may be added not by Tika, but by Solr. Do you
> > know how to enable/disable this feature?
> >
> > Kind Regards,
> > Furkan KAMACI
> >
> > [1] https://wiki.apache.org/solr/ExtractingRequestHandler
> >
> > On Thu, Nov 24, 2016 at 6:51 PM, Erick Erickson <erickerick...@gmail.com
> >
> > wrote:
> >
> >> about PatternCaptureGroupFilterFactory. This isn't going to help. The
> >> data you see when you return stored data is _before_ any analysis so
> >> the PatternFactory won't be applied. You could do this in a
> >> ScriptUpdateProcessorFactory. Or, just don't worry about it and have
> >> the real app deal with it.
> >>
> >> I don't particularly know about the Tika settings, that's largely a
> guess.
> >>
> >> Best,
> >> Erick
> >>
> >> On Thu, Nov 24, 2016 at 8:43 AM, Furkan KAMACI <furkankam...@gmail.com>
> >> wrote:
> >> > Hi Erick,
> >> >
> >> > 1) I am looking at the stored data via the Solr Admin UI. I send the
> >> > query and check what is in the content field.
> >> >
> >> > 2) I can debug the Tika settings if you think that having such metadata
> >> > fields combined into the content field is not the desired behaviour.
> >> >
> >> > *PS: *Is there any solution to get rid of it except for
> >> > using PatternCaptureGroupFilterFactory?
> >> >
> >> > Kind Regards,
> >> > Furkan KAMACI
> >> >
> >> > On Thu, Nov 24, 2016 at 6:31 PM, Erick Erickson <
> erickerick...@gmail.com
> >> >
> >> > wrote:
> >> >
> >> >> 1> I'm assuming when you "see" this data you're looking at the stored
> >> >> data, right? It's a verbatim copy of whatever you sent to the field.
> >> >> I'm guessing it's a character-encoding mismatch between the source
> and
> >> >> what you use to display.
> >> >>
> >> >> 2> How are you extracting this data? There are Tika options I think
> >> >> that can/do mush fields together.
> >> >>
> >> >> Best,
> >> >> Erick
> >> >>
> >> >>
> >> >>
> >> >> On Thu, Nov 24, 2016 at 7:54 AM, Furkan KAMACI <
> furkankam...@gmail.com>
> >> >> wrote:
> >> >> > Hi,
> >> >> >
> >> >> > I'm testing Solr 4.9.1 and have indexed documents with it. The
> >> >> > content field in my schema has the text_general field type, which is
> >> >> > not modified from the original. I do not copy any fields to content.
> >> >> > When I check the data I see content values like:
> >> >> >
> >> >> >  " \n \nstream_source_info MARLON BRANDO.rtf
>  \nstream_content_type
> >> >> > application/rtf   \nstream_size 13580   \nstream_name MARLON
> >> BRANDO.rtf
> >> >> > \nContent-Type application/rtf   \nresourceName MARLON BRANDO.rtf
>  \n
> >> >> \n
> >> >> > \n  1. Vivien Leigh and Marlon Brando in \"A Streetcar Named
> Desire\"
> >> >> > directed by Elia Kazan \n"
> >> >> >
> >> >> > My questions:
> >> >> >
> >> >> > 1) Is it usual to have these newline characters?
> >> >> > 2) Is it usual to have file metadata at the beginning of the content
> >> >> > (i.e. stream source, stream_content_type), or is it related to the
> >> >> > tool that I use to post data to Solr?
> >> >> >
> >> >> > Kind Regards,
> >> >> > Furkan KAMACI
> >> >>
> >>
>


ClassicIndexSchemaFactory with Solr 6.3

2016-11-26 Thread Furkan KAMACI
Hi,

I'm trying Solr 6.3 and I don't want to use the managed schema. That was OK
in Solr 5.x. However, the solrconfig.xml of Solr 6.3 doesn't have a
ManagedIndexSchemaFactory definition. The documentation is wrong on this
point (
https://cwiki.apache.org/confluence/display/solr/Schema+Factory+Definition+in+SolrConfig
)

How can I use ClassicIndexSchemaFactory with Solr 6.3?
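
For reference, a minimal sketch: in Solr 6.x the schema factory is declared
directly in solrconfig.xml, and the classic factory then expects a schema.xml
file in the core's conf directory:

<schemaFactory class="ClassicIndexSchemaFactory"/>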

Kind Regards,
Furkan KAMACI


Re: Metadata and Newline Characters at Content

2016-11-24 Thread Furkan KAMACI
Hi Erick,

When I check the *Solr* documentation I see that [1]:

*In addition to Tika's metadata, Solr adds the following metadata (defined
in ExtractingMetadataConstants):*

*"stream_name" - The name of the ContentStream as uploaded to Solr.
Depending on how the file is uploaded, this may or may not be set.*
*"stream_source_info" - Any source info about the stream. See
ContentStream.*
*"stream_size" - The size of the stream in bytes(?)*
*"stream_content_type" - The content type of the stream, if available.*

So, it seems that these may be added not by Tika, but by Solr. Do you know
how to enable/disable this feature?

Kind Regards,
Furkan KAMACI

[1] https://wiki.apache.org/solr/ExtractingRequestHandler
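
For reference, a common way to control where these stream_* fields land — a
sketch assuming the stock /update/extract handler, with ignored_ as an
illustrative prefix — is the uprefix parameter, which maps any metadata
field not defined in the schema onto a prefix the schema can ignore:

<requestHandler name="/update/extract"
                class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="uprefix">ignored_</str>
    <str name="fmap.content">content</str>
  </lst>
</requestHandler>

together with a schema rule such as:

<dynamicField name="ignored_*" type="ignored" multiValued="true"/>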

On Thu, Nov 24, 2016 at 6:51 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> about PatternCaptureGroupFilterFactory. This isn't going to help. The
> data you see when you return stored data is _before_ any analysis so
> the PatternFactory won't be applied. You could do this in a
> ScriptUpdateProcessorFactory. Or, just don't worry about it and have
> the real app deal with it.
>
> I don't particularly know about the Tika settings, that's largely a guess.
>
> Best,
> Erick
>
> On Thu, Nov 24, 2016 at 8:43 AM, Furkan KAMACI <furkankam...@gmail.com>
> wrote:
> > Hi Erick,
> >
> > 1) I am looking at the stored data via the Solr Admin UI. I send the
> > query and check what is in the content field.
> >
> > 2) I can debug the Tika settings if you think that having such metadata
> > fields combined into the content field is not the desired behaviour.
> >
> > *PS: *Is there any solution to get rid of it except for
> > using PatternCaptureGroupFilterFactory?
> >
> > Kind Regards,
> > Furkan KAMACI
> >
> > On Thu, Nov 24, 2016 at 6:31 PM, Erick Erickson <erickerick...@gmail.com
> >
> > wrote:
> >
> >> 1> I'm assuming when you "see" this data you're looking at the stored
> >> data, right? It's a verbatim copy of whatever you sent to the field.
> >> I'm guessing it's a character-encoding mismatch between the source and
> >> what you use to display.
> >>
> >> 2> How are you extracting this data? There are Tika options I think
> >> that can/do mush fields together.
> >>
> >> Best,
> >> Erick
> >>
> >>
> >>
> >> On Thu, Nov 24, 2016 at 7:54 AM, Furkan KAMACI <furkankam...@gmail.com>
> >> wrote:
> >> > Hi,
> >> >
> >> > I'm testing Solr 4.9.1 and have indexed documents with it. The content
> >> > field in my schema has the text_general field type, which is not
> >> > modified from the original. I do not copy any fields to content. When I
> >> > check the data I see content values like:
> >> >
> >> >  " \n \nstream_source_info MARLON BRANDO.rtf   \nstream_content_type
> >> > application/rtf   \nstream_size 13580   \nstream_name MARLON
> BRANDO.rtf
> >> > \nContent-Type application/rtf   \nresourceName MARLON BRANDO.rtf   \n
> >> \n
> >> > \n  1. Vivien Leigh and Marlon Brando in \"A Streetcar Named Desire\"
> >> > directed by Elia Kazan \n"
> >> >
> >> > My questions:
> >> >
> >> > 1) Is it usual to have these newline characters?
> >> > 2) Is it usual to have file metadata at the beginning of the content
> >> > (i.e. stream source, stream_content_type), or is it related to the tool
> >> > that I use to post data to Solr?
> >> >
> >> > Kind Regards,
> >> > Furkan KAMACI
> >>
>


Re: Metadata and Newline Characters at Content

2016-11-24 Thread Furkan KAMACI
Hi Erick,

1) I am looking at the stored data via the Solr Admin UI. I send the query
and check what is in the content field.

2) I can debug the Tika settings if you think that having such metadata
fields combined into the content field is not the desired behaviour.

*PS: *Is there any solution to get rid of it except for
using PatternCaptureGroupFilterFactory?

Kind Regards,
Furkan KAMACI

On Thu, Nov 24, 2016 at 6:31 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> 1> I'm assuming when you "see" this data you're looking at the stored
> data, right? It's a verbatim copy of whatever you sent to the field.
> I'm guessing it's a character-encoding mismatch between the source and
> what you use to display.
>
> 2> How are you extracting this data? There are Tika options I think
> that can/do mush fields together.
>
> Best,
> Erick
>
>
>
> On Thu, Nov 24, 2016 at 7:54 AM, Furkan KAMACI <furkankam...@gmail.com>
> wrote:
> > Hi,
> >
> > I'm testing Solr 4.9.1 and have indexed documents with it. The content
> > field in my schema has the text_general field type, which is not modified
> > from the original. I do not copy any fields to content. When I check the
> > data I see content values like:
> >
> >  " \n \nstream_source_info MARLON BRANDO.rtf   \nstream_content_type
> > application/rtf   \nstream_size 13580   \nstream_name MARLON BRANDO.rtf
> > \nContent-Type application/rtf   \nresourceName MARLON BRANDO.rtf   \n
> \n
> > \n  1. Vivien Leigh and Marlon Brando in \"A Streetcar Named Desire\"
> > directed by Elia Kazan \n"
> >
> > My questions:
> >
> > 1) Is it usual to have these newline characters?
> > 2) Is it usual to have file metadata at the beginning of the content
> > (i.e. stream source, stream_content_type), or is it related to the tool
> > that I use to post data to Solr?
> >
> > Kind Regards,
> > Furkan KAMACI
>


Metadata and Newline Characters at Content

2016-11-24 Thread Furkan KAMACI
Hi,

I'm testing Solr 4.9.1 and have indexed documents with it. The content
field in my schema has the text_general field type, which is not modified
from the original. I do not copy any fields to content. When I check the
data I see content values like:

 " \n \nstream_source_info MARLON BRANDO.rtf   \nstream_content_type
application/rtf   \nstream_size 13580   \nstream_name MARLON BRANDO.rtf
\nContent-Type application/rtf   \nresourceName MARLON BRANDO.rtf   \n  \n
\n  1. Vivien Leigh and Marlon Brando in \"A Streetcar Named Desire\"
directed by Elia Kazan \n"

My questions:

1) Is it usual to have these newline characters?
2) Is it usual to have file metadata at the beginning of the content (i.e.
stream source, stream_content_type), or is it related to the tool that I
use to post data to Solr?

Kind Regards,
Furkan KAMACI


Overlapped Gap Facets

2016-11-17 Thread Furkan KAMACI
Is it possible to build a facet like this on a date field:

 Last 1 Day
 Last 1 Week
 Last 1 Month
 Last 6 Month
 Last 1 Year
 Older than 1 Year

where the facet ranges overlap?
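
One way to sketch this, assuming timestamp is a date-typed field: range
facets cannot overlap, but each bucket can be expressed as an independent
facet.query with its own label, e.g.:

facet=true
&facet.query={!key="Last 1 Day"}timestamp:[NOW/DAY-1DAY TO NOW]
&facet.query={!key="Last 1 Week"}timestamp:[NOW/DAY-7DAYS TO NOW]
&facet.query={!key="Last 1 Month"}timestamp:[NOW/DAY-1MONTH TO NOW]
&facet.query={!key="Last 6 Month"}timestamp:[NOW/DAY-6MONTHS TO NOW]
&facet.query={!key="Last 1 Year"}timestamp:[NOW/DAY-1YEAR TO NOW]
&facet.query={!key="Older than 1 Year"}timestamp:[* TO NOW/DAY-1YEAR]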

Kind Regards,
Furkan KAMACI


Re: Aggregate Values Inside a Facet Range

2016-11-04 Thread Furkan KAMACI
Yes, it works with hours too. You can run a sum function over each hour
facet, which is called a bucket.
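
A sketch of that kind of request with the JSON Facet API — assuming the
timestamp and count fields from the question below:

json.facet={
  daily : {
    type  : range,
    field : timestamp,
    start : "NOW/DAY-3DAYS",
    end   : "NOW",
    gap   : "+1DAY",
    facet : { total : "sum(count)" }
  }
}

Changing gap to "+1HOUR" (with start/end adjusted to match) gives per-hour
buckets, each with its own sum.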

On Nov 4, 2016 10:14 PM, "William Bell" <billnb...@gmail.com> wrote:

> How about hours?
>
> NOW+1HR
> NOW+2HR
> NOW+12HR
> NOW-4HR
>
> Can we add that?
>
>
> On Fri, Nov 4, 2016 at 12:25 PM, Furkan KAMACI <furkankam...@gmail.com>
> wrote:
>
> > I have documents like that
> >
> > id:5
> > timestamp:NOW //pseudo date representation
> > count:13
> >
> > id:4
> > timestamp:NOW //pseudo date representation
> > count:3
> >
> > id:3
> > timestamp:NOW-1DAY //pseudo date representation
> > count:21
> >
> > id:2
> > timestamp:NOW-1DAY //pseudo date representation
> > count:29
> >
> > id:1
> > timestamp:NOW-3DAY //pseudo date representation
> > count:4
> >
> > When I facet the last 3 days of data by timestamp it's OK. However, what
> > I need is this:
> >
> > facets:
> > TODAY: 16 //pseudo representation
> > TODAY - 1: 50 //pseudo date representation
> > TODAY - 2: 0 //pseudo date representation
> > TODAY - 3: 4 //pseudo date representation
> >
> > I mean, I have to facet by dates and aggregate values inside that facet
> > range. Is it possible to do that without multiple queries in Solr?
> >
> > Kind Regards,
> > Furkan KAMACI
> >
>
>
>
> --
> Bill Bell
> billnb...@gmail.com
> cell 720-256-8076
>


Re: Aggregate Values Inside a Facet Range

2016-11-04 Thread Furkan KAMACI
It seems that SolrJ doesn't support the JSON Facet API yet.
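
Even so, json.facet is an ordinary request parameter, so it can still be
sent from SolrJ by hand and the result read back generically. A sketch,
assuming a Solr 6.x SolrJ client and a placeholder collection named
collection1:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.util.NamedList;

public class JsonFacetWorkaround {
  public static void main(String[] args) throws Exception {
    HttpSolrClient client =
        new HttpSolrClient.Builder("http://localhost:8983/solr/collection1").build();
    SolrQuery query = new SolrQuery("*:*");
    query.setRows(0); // only the facet results are needed
    // json.facet is a plain request parameter, so it can be added directly
    query.add("json.facet",
        "{daily:{type:range,field:timestamp,"
            + "start:\"NOW/DAY-3DAYS\",end:\"NOW\",gap:\"+1DAY\","
            + "facet:{total:\"sum(count)\"}}}");
    QueryResponse response = client.query(query);
    // JSON Facet results come back under the top-level "facets" element
    NamedList<?> facets = (NamedList<?>) response.getResponse().get("facets");
    System.out.println(facets);
    client.close();
  }
}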

On Fri, Nov 4, 2016 at 9:08 PM, Furkan KAMACI <furkankam...@gmail.com>
wrote:

> Fantastic! Thanks Yonik, I could do what I wanted with the JSON Facet
> API.
>
> On Fri, Nov 4, 2016 at 8:42 PM, Yonik Seeley <ysee...@gmail.com> wrote:
>
>> On Fri, Nov 4, 2016 at 2:25 PM, Furkan KAMACI <furkankam...@gmail.com>
>> wrote:
>> > I mean, I have to facet by dates and aggregate values inside that facet
>> > range. Is it possible to do that without multiple queries in Solr?
>>
>> This (old) blog shows a percentiles calculation under a range facet:
>> http://yonik.com/percentiles-for-solr-faceting/
>>
>> -Yonik
>>
>
>

