Re: [ANNOUNCE] Apache Solr 8.8.1 released

2021-02-27 Thread David Smiley
The corresponding docker image has been released as well: https://hub.docker.com/_/solr (credit to Tobias Kässmann for helping) ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Tue, Feb 23, 2021 at 10:39 AM Timothy Potter wrote: > The Lucene

Re: Atomic Update (nested), Unified Highlighter and Lazy Field Loading => Invalid Index

2021-02-19 Thread David Smiley
at ends up being LazyField if you have that feature enabled, or possible wasted space if you don't have that enabled. So I don't think the ability to exclude fields in "fl" would obsolete enableLazyFieldLoading which I think you are implying? ~ David Smiley Apache Lucene/Solr S

Re: Congratulations to the new Apache Solr PMC Chair, Jan Høydahl!

2021-02-18 Thread David Smiley
Congratulations Jan! ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Thu, Feb 18, 2021 at 1:56 PM Anshum Gupta wrote: > Hi everyone, > > I’d like to inform everyone that the newly formed Apache Solr PMC nominated > and elected

Re: Atomic Update (nested), Unified Highlighter and Lazy Field Loading => Invalid Index

2021-02-18 Thread David Smiley
ional issue > here because it happens only when id field contains an underscore (didn't > check for other special characters). > Currently I have no other choice but to use enableLazyFieldLoading=false. > I hope it wouldn't have a significant performance impact. > > -----Original Mess

Re: Atomic Update (nested), Unified Highlighter and Lazy Field Loading => Invalid Index

2021-02-17 Thread David Smiley
a query that only returns the "id" field. No highlighting. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Wed, Feb 17, 2021 at 10:28 AM David Smiley wrote: > Thanks for more details. I was able to reproduce this locally! I hacked >

Re: Atomic Update (nested), Unified Highlighter and Lazy Field Loading => Invalid Index

2021-02-17 Thread David Smiley
. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Wed, Feb 17, 2021 at 6:36 AM Nussbaum, Ronen wrote: > Hello David, > > Thank you for your reply. > It was very hard but finally I discovered how to reproduce it. I thought > of i

Re: Atomic Update (nested), Unified Highlighter and Lazy Field Loading => Invalid Index

2021-02-14 Thread David Smiley
ata; maybe that can illustrate the problem? It's not clear if nested schema or nested docs are actually required in your example. If you share the JIRA issue with me, I'll chase this one down. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Sun, Fe

Re: Incorrect distance returned for indexed polygone shape

2021-01-31 Thread David Smiley
enough for what you want to do. Basically, calculate the geodist but subtract the radius field... maybe something like this (untested!): sort=sub(geodist(),radius) desc. Use LatLonPointSpatialField to store point data if you can (if appropriate), which succeeded RPT for that. ~ David Smiley
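
The untested sort expression in this reply can be turned into request parameters roughly as follows. This is only a sketch: the field names `location` and `radius` are hypothetical, and `sfield`/`pt` are the standard parameters that `geodist()` reads when called without arguments.

```python
from urllib.parse import urlencode


def radius_adjusted_sort_params(lat, lon, sfield="location", radius_field="radius"):
    """Build Solr params that sort by geodist() minus a per-document radius field."""
    return {
        "q": "*:*",
        "sfield": sfield,            # hypothetical spatial field name
        "pt": f"{lat},{lon}",        # reference point as "lat,lon"
        # Same expression as in the message: distance minus the document's radius.
        "sort": f"sub(geodist(),{radius_field}) desc",
    }


params = radius_adjusted_sort_params(45.15, -93.85)
query_string = urlencode(params)
```

The resulting `query_string` could be appended to a `/select` URL; whether `desc` or `asc` is the right direction depends on the use case, which the truncated snippet does not settle.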

Re: Performance issue with Solr 8.6.1 Unified Highlighter does not occur on Solr 6.

2021-01-29 Thread David Smiley
ecause hl.requireFieldMatch=false is the default, doesn't mean it's the _right_ choice for everyone's app :-). I tend to think Solr should flip this in 9.0 for both accuracy & performance sake. And unset hl.maxAnalyzedChars -- mostly an obsolete safety with the UH being so much faster. ~ David Smi

Re: Performance issue with Solr 8.6.1 Unified Highlighter does not occur on Solr 6.

2021-01-28 Thread David Smiley
). ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Wed, Jan 27, 2021 at 2:20 AM Kerwin wrote: > Hi, > > While upgrading to Solr 8 from 6 the Unified highlighter begins to have > performance issues going from approximately 100ms to more th

Re: Exact and non exact highlighting

2021-01-22 Thread David Smiley
Solr schema. If you are up for it, comment on that issue to let the original contributor know you want to help move this forward. Maybe they do too. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Fri, Jan 22, 2021 at 12:46 PM df2832368_...@am

Re: Highlighting large text fields

2021-01-12 Thread David Smiley
likely to not highlight as much as you are highlighting now, and highlighting more is your goal right now it appears. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Tue, Jan 12, 2021 at 2:45 PM Shaun Campbell wrote: > That's great David.

Re: Highlighting large text fields

2021-01-12 Thread David Smiley
, and I haven't investigated it yet. ~ David > > Thanks > Shaun > > On Tue, 12 Jan 2021 at 16:30, David Smiley wrote: > > > On Tue, Jan 12, 2021 at 9:39 AM Shaun Campbell > > > wrote: > > > > > Hi David > > > > > > First of

Re: Highlighting large text fields

2021-01-12 Thread David Smiley
to each request? > You can set highlighting and other *parameters* in solrconfig.xml for request handlers. But the dedicated plugin info is only for the original and Fast Vector Highlighters. ~ David > > Thanks > Shaun > > On Mon, 11 Jan 2021 at 20:57, David Smiley wrote: > >

Re: Highlighting large text fields

2021-01-11 Thread David Smiley
-index) -- storeOffsetsWithPositions. That's an option on the field/fieldType in your schema; it may not be obvious reading the docs. You have to opt-in to that; Solr doesn't normally store any info in the index for highlighting. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley
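
As a sketch of the opt-in described here, the field option could be set through the Schema API's `add-field` command. The field name `body` and type `text_general` are assumptions for illustration, not taken from the thread, and existing documents would need re-indexing after such a change.

```python
import json


def add_field_command(name, field_type="text_general"):
    """Return a Schema API 'add-field' command that stores offsets in the postings."""
    return {
        "add-field": {
            "name": name,                       # hypothetical field name
            "type": field_type,                 # hypothetical field type
            "indexed": True,
            "stored": True,
            "storeOffsetsWithPositions": True,  # the opt-in mentioned in the message
        }
    }


payload = json.dumps(add_field_command("body"))
```

The `payload` string would be POSTed to the collection's `/schema` endpoint; the same attribute can equally be written on the field or fieldType element in the schema file.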

Re: SPLITSHARD - data loss of child documents

2020-12-19 Thread David Smiley
https://issues.apache.org/jira/browse/SOLR-11191 and I assigned it to myself just now. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Thu, Dec 17, 2020 at 9:50 AM Mike Drob wrote: > I was under the impression that split shard doesn’t w

Re: data import handler deprecated?

2020-11-30 Thread David Smiley
ce of news / release notes), the functionality has *moved*. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Mon, Nov 30, 2020 at 8:04 AM Eric Pugh wrote: > You don’t need to abandon DIH right now…. You can just use the Github > hosted vers

Re: Faceting: !terms vs mincount precedence

2020-11-17 Thread David Smiley
ect answer to your question RE mincount... perhaps it can be made to work? ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Tue, Nov 17, 2020 at 8:21 AM Jason Gerlowski wrote: > Hey all, > > I was using the {!terms} local parameter on some

Re: [ANNOUNCE] Apache Solr 8.7.0 released

2020-11-09 Thread David Smiley
FYI an updated Docker image was just published a few hours ago: https://hub.docker.com/_/solr ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Wed, Nov 4, 2020 at 9:06 AM Atri Sharma wrote: > 3/11/2020, Apache Solr™ 8.7 available > > The L

Re: Solr 8.6.3

2020-10-22 Thread David Smiley
ted the warning about this in 8.7, so you won't see that again. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Thu, Oct 15, 2020 at 4:13 PM Kris Gurusamy wrote: > I've just downloaded solr 8.6.3 and trying to create DIH for loading > structured XML

HEY, are you using the Analytics contrib?

2020-09-03 Thread David Smiley
Solr maintainers continue to maintain it. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley

Re: What is the Best way to block certain types of queries/ query patterns in Solr?

2020-09-03 Thread David Smiley
to support arbitrary parameters you pass to Solr as-is that you don't know about in advance (i.e. use an allow-list). ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Mon, Aug 31, 2020 at 10:57 AM Mark Robinson wrote: > Hi, > I had come across a mai

Re: Error on searches containing specific character pattern

2020-09-03 Thread David Smiley
/lucene/core/src/java/org/apache/lucene/util/QueryBuilder.java#L653 If you can reproduce this with the "techproducts" schema, please share the complete query. If there's a problem here, I suspect the synonyms you have may be pertinent. ~ David Smiley Apache Lucene/Solr Search Deve

[CVE-2020-13941] Apache Solr information disclosure vulnerability

2020-08-14 Thread David Smiley
to trusted paths * Prevent remote connection when using Windows UNC Paths ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley

Re: org.apache.lucene.util.fst.FST taking up lot of Java Heap Memory

2020-08-07 Thread David Smiley
e probably not using Solr 8.4.0 or beyond, which moved to having the FSTs off-heap -- at least the ones associated with the field indexes. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Thu, Aug 6, 2020 at 8:19 PM sanjay dutt wrote: > Fiel

Re: org.apache.lucene.util.fst.FST taking up lot of Java Heap Memory

2020-08-05 Thread David Smiley
What is the Solr field type definition for this field? And what sort of spatial data do you add here -- just points or what? ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Mon, Aug 3, 2020 at 10:09 PM sanjay dutt wrote: > Hello Solr commun

Re: Out of memory errors with Spatial indexing

2020-07-06 Thread David Smiley
I believe you are experiencing this bug: LUCENE-5056 <https://issues.apache.org/jira/browse/LUCENE-5056> The fix would probably be adjusting code in here org.apache.lucene.spatial.query.SpatialArgs#calcDistanceFromErrPct ~ David Smiley Apache Lucene/Solr Search Developer http://www.linked

Re: unified highlighter performance in solr 8.5.1

2020-07-05 Thread David Smiley
rue as default. > > On 7/4/20, David Smiley wrote: > > I doubt that WORD mode is impacted much by hl.fragsizeIsMinimum in terms > of > > quality of the highlight since there are vastly more breaks to pick from. > > I think that setting is more useful in SENTENCE mode if

Re: unified highlighter performance in solr 8.5.1

2020-07-03 Thread David Smiley
I doubt that WORD mode is impacted much by hl.fragsizeIsMinimum in terms of quality of the highlight since there are vastly more breaks to pick from. I think that setting is more useful in SENTENCE mode if you can stand the perf hit. If you agree, then why not just let this one default to "true"?
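
Until a default changes, the setting discussed here can be passed explicitly per request. A minimal sketch, assuming a hypothetical highlighted field named `content`; `hl.method`, `hl.bs.type` and `hl.fragsizeIsMinimum` are the Unified Highlighter parameters named in this thread.

```python
def unified_highlight_params(field, fragsize_is_minimum=True, break_type="SENTENCE"):
    """Build highlighting params for the Unified Highlighter with an explicit fragsize mode."""
    return {
        "hl": "true",
        "hl.method": "unified",
        "hl.fl": field,                  # hypothetical field to highlight
        "hl.bs.type": break_type,        # SENTENCE or WORD, as discussed in the thread
        # Solr expects lowercase "true"/"false" for boolean params.
        "hl.fragsizeIsMinimum": str(fragsize_is_minimum).lower(),
    }


hl_params = unified_highlight_params("content")
```

These could also be placed in the request handler defaults in solrconfig.xml rather than sent on every request.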

Re: Out of memory errors with Spatial indexing

2020-07-03 Thread David Smiley
="solr.RptWithGeometrySpatialField" which internally is based off a combination of a coarse grid and storing the original vector geometry for accurate verification: The internally coarser grid will lessen the impact of that pole bug. ~ David Smiley Apache Lucene/Solr Search Deve

Re: unified highlighter performance in solr 8.5.1

2020-07-03 Thread David Smiley
I think we should flip the default of hl.fragsizeIsMinimum to be 'true', thus have the behavior close to what preceded 8.5. (a) it was very recently (<= 8.4) the previous behavior and so may require less tuning for users in 8.6 henceforth (b) it's significantly faster for long text -- seems to be

Re: Master Slave Terminology

2020-06-17 Thread David Smiley
priv...@lucene.apache.org but it should have been public and expect it to spill out to the dev list today. ~ David On Wed, Jun 17, 2020 at 11:14 AM Mike Drob wrote: > Hi Jan, > > Can you link to the discussion? I searched the dev list and didn’t see > anything, is it on slack or a jira or

Re: Facet Performance

2020-06-17 Thread David Smiley
I strongly recommend setting indexed=true on a field you facet on for the purposes of efficient refinement (fq=field:value). But it strictly isn't required, as you have discovered. ~ David On Wed, Jun 17, 2020 at 9:02 AM Michael Gibney wrote: > facet.method=enum works by executing a query
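
The refinement step mentioned here (fq=field:value) is the query that benefits from indexed=true. A rough sketch of building the facet request and the follow-up filter; the field name and value passed in are hypothetical, and the quoting helper is a simplification rather than a full query escaper.

```python
def facet_request_params(field):
    """Params for an initial faceting request on one field."""
    return {"q": "*:*", "rows": "0", "facet": "true", "facet.field": field}


def refinement_filter(field, value):
    """Build an fq of the form field:"value", escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def refined_params(field, value):
    """Facet request narrowed to a single facet value the user selected."""
    params = facet_request_params(field)
    params["fq"] = refinement_filter(field, value)
    return params
```

For example, `refined_params("cat", "hard drive")` adds `fq=cat:"hard drive"` to the request.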

Re: Why Did It Match?

2020-05-29 Thread David Smiley
I've used the highlighter in the past for this but it has to do a lot more work than "explain". Typically that extra work is analysis of the fields' text again. Still; the highlighter can make sense when the individual fields aren't otherwise searchable because you are searching on an aggregate

Re: unified highlighter performance in solr 8.5.1

2020-05-27 Thread David Smiley
; > > On Tuesday, 26 May 2020 17:44:52 CEST David Smiley wrote: > > > Please create an issue. I haven't reproduced it yet but it seems > unlikely > > > to be user-error. > > > > > > ~ David > > > > > > > > > On M

Re: unified highlighter performance in solr 8.5.1

2020-05-26 Thread David Smiley
Please create an issue. I haven't reproduced it yet but it seems unlikely to be user-error. ~ David On Mon, May 25, 2020 at 9:28 AM Michal Hlavac wrote: > Hi, > > I have field: > stored="true" indexed="false" storeOffsetsWithPositions="true"/> > > and configuration: > true > unified > true

Re: unified highlighter performance in solr 8.5.1

2020-05-25 Thread David Smiley
Wow that's terrible! So this problem is for SENTENCE in particular, and it's a regression in 8.5? I'll see if I can reproduce this with the Lucene benchmark module. I figure you have some meaty text, like "page" size or longer? ~ David On Mon, May 25, 2020 at 10:38 AM Michal Hlavac wrote: >

Re: highlighting a whole html document using Unified highlighter

2020-05-24 Thread David Smiley
t as html document ? > (preserving the field data coming from meta-tags and not strip the html > tags) > > Then I could use solr.HTMLStripCharFilterFactory for analysis. > > Thank You, > > Serkan, > > > > > -Original Message- > From: David Smi

Re: highlighting a whole html document using Unified highlighter

2020-05-24 Thread David Smiley
roblem, and the root cause is here: LUCENE-5734 <https://issues.apache.org/jira/browse/LUCENE-5734> It's on my long TODO list but hasn't bitten me lately so I've neglected it. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Sun, May 24, 2020

Re: hl.preserveMulti in Unified highlighter?

2020-05-23 Thread David Smiley
formance hit from > > essentially removing the offset usage, but our highlighted fields aren't > > extremely large :-) > > > > Hope that helps! > > Anthony > > > > *Anthony Groves* | Technical Lead, Search > > > > O'Reilly Media, Inc. | https://www.link

Re: Creating custom PassageFormatter

2020-05-22 Thread David Smiley
You've probably gotten your answer now but "no". Basically, you'd need to specify your own subclass of UnifiedSolrHighlighter in solrconfig.xml like this: Error loading class 'solr.highlight.CustomPassageFormatter'". > > Example from solrconfig.xml: >

Re: hl.preserveMulti in Unified highlighter?

2020-05-22 Thread David Smiley
Hi Walter, No, the UnifiedHighlighter does not behave as if this setting were true. The docs say: `hl.preserveMulti`:: If `true`, multi-valued fields will return all values in the order they were saved in the index. If `false`, the default, only values that match the highlight request will be

Re: Alternate Fields for Unified Highlighter

2020-05-22 Thread David Smiley
f Solr had a DocTransformer to accomplish that. I know it's been awhile; I'm curious how the UH has been working for you, assuming you are using it. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Sun, Jun 2, 2019 at 6:47 AM Furkan KAMACI wrote:

Re: unified highlighter methods works unexpected

2020-05-22 Thread David Smiley
/solr/techproducts/select?defType=edismax=id%2Cname=name=unified=on=3%3C74%25=%22hard%20dri%22=name%20text=true=0.1 If you could help me in telling me reproducibility instructions with tech_products, then I can help diagnose the underlying problem and possibly fix. ~ David Smiley Apache Lucene/Solr

Re: Unified highlighter with storeOffsetsWithPositions and termVectors giving an exception

2020-05-22 Thread David Smiley
FWIW I tried this on the techproducts schema with a modification to the name field, but did not see the issue. I suspect you did not re-index after making these schema changes. If you did, then also check that the collection (or core) truly started fresh (never had any previous schema) because

Re: Highlighting Solr 8

2020-05-22 Thread David Smiley
What did you end up doing, Eric? Did you migrate to the Unified Highlighter? ~ David On Wed, Oct 16, 2019 at 4:36 PM Eric Allen wrote: > Thanks for the reply. > > Currently we are migrating from solr4 to solr8 under solr 4 we wrote our > own highlighter because the provided one was too slow

Re: Unified highlighter- unable to get results - can get results with original and termvector highlighters

2020-05-22 Thread David Smiley
Hello, Did you get it to work eventually? Try setting hl.weightMatches=false and see if that helps. Whether this helps or not, I'd like to have a deeper understanding of the internal structure of the Query (not the original query string). What query parser are you using? If you pass

Re: Syntax error while parsing Spatial Query as string

2020-02-14 Thread David Smiley
as it's obsolete. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Fri, Feb 14, 2020 at 6:47 AM vas aj wrote: > Hi team, > > I am using Lucene 6.6.2, Spatial4j 0.7, lucene-spatial-extras 6.6.2. I am > trying to create a Spatial

Re: Dependency log4j-slf4j-impl for solr-core:7.5.0 causing a number of build problems

2020-01-16 Thread David Smiley
Ultimately if you deduce the problem, file a JIRA issue and share it with me; I will look into it. I care about this matter too; I hate having to exclude logging dependencies on the consuming end. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Wed

Re: Solr spatial search - overlapRatio of polygons

2020-01-08 Thread David Smiley
an 8, 2020 at 1:16 PM David Smiley wrote: > My response to a direct email (copying here with permission): > > It's possible; you'll certainly have to write some code here to make this > work, including some new Solr plugin; perhaps ValueSourceParser that can > compute a more accura

Fwd: Solr spatial search - overlapRatio of polygons

2020-01-08 Thread David Smiley
://lucene.apache.org/solr/guide/8_3/query-re-ranking.html -- Forwarded message - From: Marc Date: Tue, Jan 7, 2020 at 6:14 AM Subject: Solr spatial search - overlapRatio of polygons To: David Smiley Dear Mr Smiley, I have a tricky question concerning the spatial search features

Re: [ANNOUNCE] Apache Solr 8.3.1 released

2019-12-09 Thread David Smiley
Thanks. I observe we too often write in that way and leave it up to the reader to assume we don’t intentionally add bugs :-) On Mon, Dec 9, 2019 at 5:45 AM Colvin Cowie wrote: > Oh, just looking at the way the announcement reads on > http://lucene.apache.org/solr/news.html : > Solr 8.3.1

Re: Re: Need urgent help with Solr spatial search using SpatialRecursivePrefixTreeFieldType

2019-10-01 Thread David Smiley
'. This is a syntax parsing gotcha that has to do with how embedded queries are parsed, which is what you need to do as you need to compose two with an operator. It'd be kinda awkward to fix that gotcha in Solr. There are other techniques too, but this is the most succinct. ~ David Smiley Apache Lucene

Re: Re: Need urgent help with Solr spatial search using SpatialRecursivePrefixTreeFieldType

2019-09-30 Thread David Smiley
(e.g. min/max/sum). ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Mon, Sep 30, 2019 at 10:22 AM Anushka Gupta < anushka_gu...@external.mckinsey.com> wrote: > Hi, > > > > I want to be able to filter on different cities an

Re: Solr Backup restore

2019-09-13 Thread David Smiley
It would help if you could devise a simple set of command line steps to reproduce/demonstrate the problem using the "bin/solr -e solrcloud" setup. The problem you see ought to be reproducible here if there is a problem. ~ David Smiley Apache Lucene/Solr Search Developer http://www.li

Re: Migrating Bounding box from Lucene to Solr

2019-09-09 Thread David Smiley
, is in the UK. It's also unclear what field type you are using. If you have a polygon then use RptWithGeometrySpatialField and provide it as-such using either WKT or GeoJSON. Supplying a list of points runs the risk that the query won't actually intersect those points. ~ David Smiley Apache

Re: Query field alias - issue with circular reference

2019-09-08 Thread David Smiley
No but this seems like a decent enhancement request. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Fri, Aug 9, 2019 at 3:07 AM Jaroslaw Rozanski wrote: > Hi Folks, > > > > Question about query field aliases. > > > &

Re: upgrading from solr4 to solr8 searches taking 4 to 10 times as long to return

2019-09-07 Thread David Smiley
remove grouping; it's a complexity weight on our codebase. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Sat, Sep 7, 2019 at 5:15 PM David Smiley wrote: > 10s of seconds to respond to a simple match-all query, especially to just > a single sha

Re: upgrading from solr4 to solr8 searches taking 4 to 10 times as long to return

2019-09-07 Thread David Smiley
o see if it's a docValues perf issue compared to uninverting. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Sat, Sep 7, 2019 at 3:06 PM Russell Bahr wrote: > Hi David, > I ran the *:* query 10 times against all 30 servers and the results (below)

Re: upgrading from solr4 to solr8 searches taking 4 to 10 times as long to return

2019-09-05 Thread David Smiley
to see information on each of the components. That may give you a strong clue. If it's in the QueryComponent which actually executes the underlying search then you have some further digging to do. Use a profiler like JVisualVM. ~ David Smiley Apache Lucene/Solr Search Developer http

Re: ExecutorService support in SolrIndexSearcher

2019-08-30 Thread David Smiley
, and in particular Solr's means of flipping bits in a big bitset to accumulate the DocSet had to be careful so that multiple threads don't try to overwrite the same underlying "long" in the long[]. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On M
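
One way to picture the constraint described here: a bitset backed by a long[] keeps 64 document bits per word, so two threads writing adjacent doc IDs can touch the same word. The sketch below is an illustration of that idea only, not Solr's actual implementation; it splits the doc-ID space on 64-bit boundaries so each word belongs to exactly one worker.

```python
def word_aligned_ranges(max_doc, num_threads):
    """Split [0, max_doc) into at most num_threads ranges starting on multiples of 64.

    Each range covers whole 64-bit words, so no two ranges map to the same
    underlying word of the backing array.
    """
    num_words = (max_doc + 63) // 64                          # words needed for max_doc bits
    words_per_thread = max(1, -(-num_words // num_threads))   # ceiling division
    ranges = []
    for start_word in range(0, num_words, words_per_thread):
        start = start_word * 64
        end = min((start_word + words_per_thread) * 64, max_doc)
        ranges.append((start, end))
    return ranges
```

With `max_doc=200` and three workers, this yields `(0, 128)` and `(128, 200)`: four words split two per range, so only two workers are needed.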

Re: Solutio for long time highlighting

2019-08-30 Thread David Smiley
Still... there is perhaps some value in multi-threading the highlighting for huge docs, but I think we ultimately found no need after re-engineering the highlighter. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Wed, Aug 28, 2019 at 10:36 AM SOLR4189

[CVE-2019-0193] Apache Solr, Remote Code Execution via DataImportHandler

2019-07-31 Thread David Smiley
The DataImportHandler, an optional but popular module to pull in data from databases and other sources, has a feature in which the whole DIH configuration can come from a request's "dataConfig" parameter. The debug mode of the DIH admin screen uses this to allow convenient debugging / development

Re: Solr Geospatial Polygon Indexing/Querying Issue

2019-07-30 Thread David Smiley
On Tue, Jul 30, 2019 at 4:41 PM Sanders, Marshall (CAI - Atlanta) < marshall.sande...@coxautoinc.com> wrote: > I’ll explain the context around the use case we’re trying to solve and > then attempt to respond as best I can to each of your points. What we have > is a list of documents that in our

Re: Solr Geospatial Polygon Indexing/Querying Issue

2019-07-25 Thread David Smiley
nts directly which makes more sense when multiple spatial fields are in play. Sadly this aspect is not documented. Suffice it to say, if you do geodist(latLng) (maybe quoted?) then it'll use that field, and parse "pt" param from the request. ~ David Smiley Apache Lucene/Solr Search

Re: highlighting not working as expected

2019-06-10 Thread David Smiley
Please try hl.method=unified and tell us if that helps. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Mon, Jun 3, 2019 at 4:06 AM Martin Frank Hansen (MHQ) wrote: > Hi, > > I am having some difficulties making highlighting work. For so

Re: Range query syntax on a polygon field is returning all documents

2019-05-12 Thread David Smiley
posedly is much more efficient for Geo3D specifically. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Wed, Mar 20, 2019 at 2:00 PM David Smiley wrote: > Hi Mitchell, > > Seems like there's a bug based on what you've

Re: Date format issue in solr select query.

2019-05-09 Thread David Smiley
to a string stored field. This is necessary because primitive field types (date, float, int, etc.) normalize the input when the value is internally stored. Perhaps it shouldn't do that -- as you show here the surface form (original) may indicate the precision. ~ David Smiley Apache Lucene/Solr Search

Re: Reverse-engineering existing installation

2019-05-02 Thread David Smiley
Consider trying to diff configs from a default at the version it was copied from, if possible. Even better, the configs should be in source control and then you can browse history with commentary and sometimes links to issue trackers and code reviews. Also a big part that you can’t see by staring

Re: Unable to tag queries (q) in SOLR >= 7.2

2019-04-30 Thread David Smiley
there and *not* defType (don't set defType or set it to "lucene" which is the default). ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Tue, Apr 30, 2019 at 8:17 AM Fredrik Rodland wrote: > Hi. > > I seems SOLR-11501 may have cha

Re: Spatial Search using two separate fields for lat and long

2019-04-13 Thread David Smiley
with the lat & lon separately. Your spatial field could be stored=false, and the separate fields would be stored but otherwise not be indexed or have other characteristics that add weight. The result is efficient; no redundancies. ~ David Smiley Apache Lucene/Solr Search Developer

Re: Slower indexing speed in Solr 8.0.0

2019-04-03 Thread David Smiley
Hi Edwin, I'd like to rule something out. Does your schema define a field "_root_"? If you don't have nested documents then remove it. Its presence adds indexing weight in 8.0 that was not there previously. I'm not sure how much though; I've hoped small but who knows. ~ David Smi

Re: Slower indexing speed in Solr 8.0.0

2019-04-03 Thread David Smiley
What/where is this benchmark? I recall once Ishan was working with a volunteer to set up something like Lucene has but sadly it was not successful On Wed, Apr 3, 2019 at 6:04 AM Đạt Cao Mạnh wrote: > Hi guys, > > I'm seeing the same problems with Shalin nightly indexing benchmark. This >

Re: Range query syntax on a polygon field is returning all documents

2019-03-20 Thread David Smiley
other query syntax e.g. bbox query parser to see if the problem goes away? I doubt this is it but you seem to point to the syntax being related. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Mon, Mar 18, 2019 at 12:24 AM Mitchell Bösecke

Re: Nested geofilt query for LTR feature

2019-03-20 Thread David Smiley
e "geodist" function query. Additionally if you dump the full stack trace here, it might be helpful. Getting a RuntimeException suggests we need to do a better job of wrapping/cleaning errors internally. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmi

Re: regarding debugging solr in eclipse

2019-01-18 Thread David Smiley
On Fri, Jan 18, 2019 at 9:20 AM Scott Stults < sstu...@opensourceconnections.com> wrote: > This blog article might help: > > https://opensourceconnections.com/blog/2013/04/13/how-to-debug-solr-with-eclipse/ > > I don't use Eclipse but I believe things are better now than the instructions given.

Re: Solr 7.2.1 Stream API throws null pointer execption when used with collapse filter query

2019-01-03 Thread David Smiley
File a JIRA issue please On Thu, Jan 3, 2019 at 5:20 PM gopikannan wrote: > Hi, >I am getting null pointer exception when streaming search is done with > collapse filter query. When debugged the last element in FixedBitSet array > is null. Please let me know if I can raise an issue. > > >

Re: Geofilt and distance measurement problems using SpatialRecursivePrefixTreeFieldType field type

2018-12-23 Thread David Smiley
data. In my defence that is > far from obvious in the documentation. > > Thanks again for your help. > > Cheers, > Peter. > > -Original Message- > From: David Smiley [mailto:david.w.smi...@gmail.com] > Sent: 21 December 2018 04:44 > To: solr-user@

Re: Geofilt and distance measurement problems using SpatialRecursivePrefixTreeFieldType field type

2018-12-20 Thread David Smiley
Hi Peter, Use of an RPT field for distance sorting/boosting is to be avoided where possible because it's very inefficient at this specific use-case. Simply use LatLonType for this task, and continue to use RPT for the filter/search use-case. Also I see you putting a space between the
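
The split recommended here, one field for filtering and another for distance sorting, might look like this in request parameters. The field names `geo_rpt` and `geo_ll` are hypothetical, and the sketch assumes both fields are populated with the same point at index time.

```python
def nearby_sorted_params(lat, lon, radius_km, filter_field="geo_rpt", sort_field="geo_ll"):
    """Filter on the RPT field, but sort by distance using the lat/lon point field."""
    return {
        "q": "*:*",
        "pt": f"{lat},{lon}",
        "d": str(radius_km),
        # The sfield local param points the filter at the RPT field only.
        "fq": f"{{!geofilt sfield={filter_field}}}",
        # The global sfield is what geodist() falls back to for sorting.
        "sfield": sort_field,
        "sort": "geodist() asc",
    }


geo_params = nearby_sorted_params(51.5, -0.12, 5)
```

Note that the value of `pt` has no space after the comma.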

Re: Rectangle with rotation in Solr

2018-09-13 Thread David Smiley
Polygon is the only way. On Wed, Aug 29, 2018 at 7:46 AM Zahra Aminolroaya wrote: > I have locations with 4-tuple (longitude,latitude) which are like > rectangles > and I want to index them. Solr BBoxField with minX, maxX, maxY and minY, > only considers rectangles which does not have

Re: Impact/Performance of maxDistErr

2018-05-30 Thread David Smiley
t helps a lot to understand! > Best Regards > > Jens > > P.S. Currently the only search we are doing on the polygon is > Contains(POINT(x,y)) > > > On 29.05.2018 at 13:30, David Smiley wrote: > > Hello Jens, > With solr.RptWithGeometrySpatialField, you always get an ac

Re: Impact/Performance of maxDistErr

2018-05-29 Thread David Smiley
Hello Jens, With solr.RptWithGeometrySpatialField, you always get an accurate result thanks to the "WithGeometry" part. The "Rpt" part is a grid index, and most of the parameters pertain to that. maxDistErr controls the highest resolution grid. No shape will be indexed to higher resolutions

Re: ClassCastException: o.a.l.d.Field cannot be cast to o.a.l.d.StoredField

2018-04-26 Thread David Smiley
> but how would a DocumentTransformer affect UpdateLog replay? Oh right; nevermind that silly theory ;-) On Thu, Apr 26, 2018 at 10:42 AM Markus Jelsma wrote: > Hello David, > > Yes it was sporadic indeed, but how would a DocumentTransformer affect > UpdateLog

Re: Highlighter throwing InvalidTokenOffsetsException for field with large number of synonyms

2018-04-26 Thread David Smiley
Yay! I'm glad the UnifiedHighlighter is serving you well. I was about to suggest it. If you think the fragmentation/snippeting could be improved in a general way then post a JIRA for consideration. Note: identical results with the original Highlighter is a non-goal. On Mon, Apr 23, 2018 at

Re: ClassCastException: o.a.l.d.Field cannot be cast to o.a.l.d.StoredField

2018-04-26 Thread David Smiley
I'm not sure but I wonder why you would want to cast it in the first place. Field is the base class; all its subclasses are in one way or another utilities/conveniences. In other words, if you ever see code casting Field to some subclass, there's a good chance it's fundamentally wrong or making

Re: PreAnalyzed URP and SchemaRequest API

2018-04-13 Thread David Smiley
Yes I could imagine big gains from this strategy if OpenNLP is in the analysis chain ;-) On Fri, Apr 13, 2018 at 5:01 PM Markus Jelsma wrote: > Hello David, > > If JSON serialization is too bulky, we could also opt for > SimplePreAnalyzed right? At least as a

Re: PreAnalyzed URP and SchemaRequest API

2018-04-12 Thread David Smiley
Ah ok. I've wondered how much value there is in pre-analysis. The serialization of the analyzed form in JSON is bulky. If you can share any results, I'd be interested to hear how it went. It's an optimization so you should be able to know how much better it is. Of course it isn't for everybody

Re: PreAnalyzed URP and SchemaRequest API

2018-04-05 Thread David Smiley
Is this really a problem when you could easily enough create a TextField and call setTokenStream? Does your remote client have Solr-core and all its dependencies on the classpath? That's one way to do it... and presumably the direction you are going because you're asking how to work with

Re: querying vs. highlighting: complete freedom?

2018-04-03 Thread David Smiley
Thanks for your review! On Tue, Apr 3, 2018 at 6:56 AM Arturas Mazeika wrote: ... > What I missed at the beginning of the documentation is the minimal set of > requirements that is required to have highlighting sensible: somehow I > have a feeling that one needs some of the

Re: PreAnalyzed FieldType, and simultaneously importing JSON

2018-04-02 Thread David Smiley
Hello Markus, It appears you are not familiar with PreAnalyzedUpdateProcessor? Using that is much more flexible -- you could have different URP chains for your use-cases. IMO PreAnalyzedField ought to go away. I argued for the URP version and thus its superiority to the FieldType here:

Re: querying vs. highlighting: complete freedom?

2018-04-02 Thread David Smiley
Hi Arturas, Both Erick and I had a go at improving the documentation here. I hope it's clearer. https://builds.apache.org/job/Solr-reference-guide-master/javadoc/highlighting.html The docs for hl.fl, hl.q, hl.qparser were all updated. The meat of the change was a new note in hl.fl including an

Re: Copying a SolrCloud collection to other hosts

2018-03-28 Thread David Smiley
> Some of the original features in that tool have been incorporated into > Solr itself these days, but I still use clonecollection/copycollection > regularly. (most recently with Solr 7.2) > > > On 3/27/18, 9:55 PM, "David Smiley" <david.w.smi...@gmail.com> wrote

Re: Copying a SolrCloud collection to other hosts

2018-03-27 Thread David Smiley
The backup/restore API is intended to address this. https://builds.apache.org/job/Solr-reference-guide-master/javadoc/making-and-restoring-backups.html Erick's advice is good (and I once drafted docs for the same scheme years ago as well), but I consider it dated -- it's what people had to do
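A minimal sketch of what calling the backup/restore API mentioned above might look like from a client, assuming the Collections API `BACKUP` and `RESTORE` actions with their `name`, `collection`, and `location` parameters. The base URL, collection names, backup name, and location below are hypothetical placeholders, and the sketch only builds the request URLs rather than sending them.

```python
from urllib.parse import urlencode


def collections_api_url(base_url, action, **params):
    """Build a Collections API URL for the given action and parameters."""
    query = urlencode({"action": action, **params})
    return f"{base_url}/admin/collections?{query}"


# Hypothetical values: adjust host, names, and the shared location to your setup.
SOLR_BASE = "http://localhost:8983/solr"

backup_url = collections_api_url(
    SOLR_BASE,
    "BACKUP",
    name="nightly",                  # hypothetical backup name
    collection="products",           # hypothetical source collection
    location="/mnt/solr-backups",    # must be reachable by all nodes
)

restore_url = collections_api_url(
    SOLR_BASE,
    "RESTORE",
    name="nightly",
    collection="products_restored",  # hypothetical target collection
    location="/mnt/solr-backups",
)

print(backup_url)
print(restore_url)
```

The restore call would be issued against the destination cluster once the backup location is visible to it.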

Re: InetAddressPoint support in Solr or other IP type?

2018-03-27 Thread David Smiley
ing I was > missing since I couldn't find any discussion on this. > > Michael Cooper > > -Original Message- > From: David Smiley [mailto:david.w.smi...@gmail.com] > Sent: Friday, March 23, 2018 5:14 PM > To: solr-user@lucene.apache.org > Subject: Re: InetAddressPoint support in

Re: InetAddressPoint support in Solr or other IP type?

2018-03-23 Thread David Smiley
Hi, For IPv4, use TrieIntField with precisionStep=8. For IPv6: https://issues.apache.org/jira/browse/SOLR-6741 There's nothing there yet; you could help out if you are familiar with the codebase. Or you might try something relatively simple involving edge ngrams. ~ David On Thu, Mar 22, 2018
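A minimal sketch of the client-side half of the IPv4 suggestion above, assuming each address is converted to a signed 32-bit integer before it is sent to an int field. The function names and the choice to subtract 2^31 (so numeric order matches address order, which keeps range queries meaningful) are illustrative assumptions, not something the thread specifies.

```python
import ipaddress

SIGN_OFFSET = 1 << 31  # shifts the unsigned range [0, 2^32) into the signed int range


def ipv4_to_sortable_int(addr: str) -> int:
    """Map a dotted-quad IPv4 address to a signed 32-bit int, preserving order."""
    unsigned = int(ipaddress.IPv4Address(addr))
    return unsigned - SIGN_OFFSET


def sortable_int_to_ipv4(value: int) -> str:
    """Inverse of ipv4_to_sortable_int."""
    return str(ipaddress.IPv4Address(value + SIGN_OFFSET))


if __name__ == "__main__":
    low = ipv4_to_sortable_int("10.0.0.0")
    high = ipv4_to_sortable_int("10.255.255.255")
    # A 10.0.0.0/8 lookup would then become an int range query from low to high.
    print(low, high)
```

Under this mapping a CIDR block becomes a contiguous integer range, so a subnet filter turns into an ordinary numeric range query on the field.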

Re: Sorting results for spatial search

2018-02-01 Thread David Smiley
quote: "The problem is that this includes children that DON’T touch the search area in the sum. How can I only include the shapes from the first query above in my sort?" Unless I'm misunderstanding your intent, I think this is a simple matter of adding the spatial filter to the parent join query

Re: Sum area polygon solr

2017-11-01 Thread David Smiley
Hi, Ah, no -- sorry. If you want to roll up your sleeves and write a Solr plugin (a ValueSource in this case, perhaps) then you could look up the index polygon and then call out to JTS to compute the intersection and then ask it for the area. But that's going to be a very heavyweight computation

Re: Retrieve DocIdSet from Query in lucene 5.x

2017-10-24 Thread David Smiley
See SolrIndexSearcher.getDocSet. It may not be identical to what you want but following what it does on through to DocSetUtil.createDocSet may be enlightening. On Fri, Oct 20, 2017 at 5:10 PM Jamie Johnson wrote: > I am trying to migrate some old code that used to retrieve

Re: Solr Spatial Query Problem Hk.

2017-10-04 Thread David Smiley
Hi, Firstly, if Solr returns an error referencing an exception then you can look in Solr's logs for the stack trace, which helps debugging problems a ton (at least for Solr devs). I suspect that the problem here is that your schema might have a dynamic field where *coordinates is defined to be a

Re: Sorting by distance resources with WKT polygon data

2017-09-19 Thread David Smiley
Hello, Sorry for the belated response. Solr only supports sorting from points or rectangles in the index. For rectangles use BBoxField. For points, ideally use the new LatLonPointSpatialField; failing that use LatLonType. You can use RPT for point data but I don't recommend sorting with it;
