Re: [ANNOUNCE] Apache Solr 7.1.0 released
It pointed to 7.1.0 for me; perhaps a browser cache issue?
Anyway, you can go directly as well:
http://www.apache.org/dyn/closer.lua/lucene/solr/7.1.0

-Yonik

On Tue, Oct 17, 2017 at 11:25 AM, Susheel Kumar wrote:
> Thanks, Shalin.
>
> But the download mirror still has 7.0.1 not 7.1.0.
>
> http://www.apache.org/dyn/closer.lua/lucene/solr/7.0.1
>
> On Tue, Oct 17, 2017 at 5:28 AM, Shalin Shekhar Mangar wrote:
>>
>> 17 October 2017, Apache Solr™ 7.1.0 available
>>
>> The Lucene PMC is pleased to announce the release of Apache Solr 7.1.0
>>
>> Solr is the popular, blazing fast, open source NoSQL search platform
>> from the Apache Lucene project. Its major features include powerful
>> full-text search, hit highlighting, faceted search, dynamic
>> clustering, database integration, rich document (e.g., Word, PDF)
>> handling, and geospatial search. Solr is highly scalable, providing
>> fault tolerant distributed search and indexing, and powers the search
>> and navigation features of many of the world's largest internet sites.
>>
>> Solr 7.1.0 is available for immediate download at:
>>
>> http://lucene.apache.org/solr/mirrors-solr-latest-redir.html
>>
>> See http://lucene.apache.org/solr/7_1_0/changes/Changes.html for a
>> full list of details.
>>
>> Solr 7.1.0 Release Highlights:
>>
>> * Critical Security Update: Fix for CVE-2017-12629 which is a working
>> 0-day exploit reported on the public mailing list. See
>> https://s.apache.org/FJDl
>>
>> * Auto-scaling: Solr can now move replicas automatically when a new
>> node is added or an existing node is removed using the auto scaling
>> policy framework introduced in 7.0
>>
>> * Auto-scaling: The 'autoAddReplicas' feature which was limited to
>> shared file systems is now available for all file systems. It has been
>> ported to use the new autoscaling framework internally.
>>
>> * Auto-scaling: New set-trigger, remove-trigger, set-listener,
>> remove-listener, suspend-trigger, resume-trigger APIs
>>
>> * Auto-scaling: New /autoscaling/history API to show past autoscaling
>> actions and cluster events
>>
>> * New JSON based Query DSL for Solr that extends JSON Request API to
>> also support all query parsers and their nested parameters
>>
>> * JSON Facet API: min/max aggregations are now supported on
>> single-valued date fields
>>
>> * Lucene's Geo3D (surface of sphere & ellipsoid) is now supported on
>> spatial RPT fields by setting spatialContextFactory="Geo3D".
>> Furthermore, this is the first time Solr has out of the box support
>> for polygons
>>
>> * Expanded support for statistical stream evaluators such as various
>> distributions, rank correlations, distances and more.
>>
>> * Multiple other optimizations and bug fixes
>>
>> You are encouraged to thoroughly read the "Upgrade Notes" at
>> http://lucene.apache.org/solr/7_1_0/changes/Changes.html or in the
>> CHANGES.txt file accompanying the release.
>>
>> Solr 7.1 also includes many other new features as well as numerous
>> optimizations and bugfixes of the corresponding Apache Lucene release.
>>
>> Please report any feedback to the mailing lists
>> (http://lucene.apache.org/solr/discussion.html)
>>
>> Note: The Apache Software Foundation uses an extensive mirroring
>> network for distributing releases. It is possible that the mirror you
>> are using may not have replicated the release yet. If that is the
>> case, please try another mirror. This also goes for Maven access.
>>
>> --
>> Regards,
>> Shalin Shekhar Mangar.
Re: Welcome David Smiley to the PMC
On Mon, Mar 18, 2013 at 10:28 AM, Smiley, David W. dsmi...@mitre.org wrote:
> Thanks Steve, and to the rest of the PMC members! I hope to see many of
> you at Lucene/Solr Revolution in May.

+1 Welcome!

-Yonik
http://lucidworks.com
Re: Welcome Alan Woodward as Lucene/Solr committer
Congrats and welcome, Alan!

-Yonik
http://lucidworks.com

On Wed, Oct 17, 2012 at 1:36 AM, Robert Muir rcm...@gmail.com wrote:
> I'm pleased to announce that the Lucene PMC has voted Alan as a
> Lucene/Solr committer.
>
> Alan has been contributing patches on various tricky stuff: positions
> iterators, span queries, highlighters, codecs, and so on.
>
> Alan: it's tradition that you introduce yourself with your background.
>
> I think your account is fully working and you should be able to add
> yourself to the "who we are" page on the website as well.
>
> Congratulations!
Re: SOLR Sorting algorithm
When sorting, ties are broken by the internal document id. This gives us a stable (if somewhat arbitrary) sort ordering. If you want score to be the tiebreaker, you can specify it as the secondary sort.

-Yonik
http://www.lucene-eurocon.com - The Lucene/Solr User Conference

On Tue, Sep 6, 2011 at 1:49 PM, BrianK brian.krue...@bonton.com wrote:
> We are running a SOLR query and are specifying a custom sort field to sort
> our results based on our sort field rather than the default score. For the
> most part, the results are sorting by our field, but SOLR appears to be
> sorting the results by some other field or algorithm and it's not the score
> field. Our documents are populated from a database table and when running a
> similar query/sort against the database we don't get the same sort sequence
> as SOLR even though the sort is on the same field in both systems.
>
> IMPORTANT NOTE: the sort field/results field is not unique, the search
> results in question have the same value (1 in this case), but the results
> always come out in the same order.
>
> Can someone explain or point me in the right direction to determine how SOLR
> sorts results beyond the field specified in our query string?
>
> Example Query: q=Kitchen Products&sort=sortSequence asc
>
> Example Results:
> name: Product 1  sortSequence: 1  score: 1.52221
> name: Product 5  sortSequence: 1  score: 1.52221
> name: Product 3  sortSequence: 1  score: 1.53112
> name: Product 2  sortSequence: 2  score: 1.51112
> etc.
>
> Are there hidden fields like document date, creation date, or other field
> that is not visible that might be factored into a sort?
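
For illustration only, here is a minimal sketch of the tie-breaking behaviour described in this thread. The `docid` values are hypothetical stand-ins for the internal document id (assumed here to follow the order of the example results), and the plain Python sort only imitates what Solr does; it is not Solr code. The last lines build a query string that adds score as the secondary sort.

```python
from urllib.parse import urlencode

# Hypothetical documents. "docid" stands in for the internal document id,
# which is assumed (not known) to follow the order of the example results.
docs = [
    {"docid": 0, "name": "Product 1", "sortSequence": 1, "score": 1.52221},
    {"docid": 1, "name": "Product 5", "sortSequence": 1, "score": 1.52221},
    {"docid": 2, "name": "Product 3", "sortSequence": 1, "score": 1.53112},
    {"docid": 3, "name": "Product 2", "sortSequence": 2, "score": 1.51112},
]


def names(sorted_docs):
    """Return just the product names, in order."""
    return [d["name"] for d in sorted_docs]


# sort=sortSequence asc: ties fall back to the internal document id.
primary_only = names(sorted(docs, key=lambda d: (d["sortSequence"], d["docid"])))

# sort=sortSequence asc, score desc: score breaks ties first, then the id.
with_score = names(
    sorted(docs, key=lambda d: (d["sortSequence"], -d["score"], d["docid"]))
)

# Query string carrying the secondary sort, URL-encoded for an HTTP request.
params = urlencode({"q": "Kitchen Products", "sort": "sortSequence asc, score desc"})

print(primary_only)
print(with_score)
print(params)
```

With only the primary sort, the three documents that share sortSequence 1 keep their index order; adding `score desc` moves Product 3 ahead of the other two.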
Re: SOLR Sorting algorithm
On Tue, Sep 6, 2011 at 4:48 PM, BrianK brian.krue...@bonton.com wrote:
> by internal document id you are referring to a field that is not visible to
> us. We have an id field, I assume this is not the document id field you
> are talking about. Assuming document id is not available to us, is it
> sorting this ascending/descending and is the document id simply a sequential
> number assigned as documents are loaded/indexed by solr?

Correct - it's not a field, but just the internal index or ord into the internal data structures. It's also transient, in that it can change across commit calls (either by deleted documents being squeezed out, or by non-adjacent segments being merged).

-Yonik
http://www.lucene-eurocon.com - The Lucene/Solr User Conference
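
As a toy model only (a plain Python list, not Lucene's actual data structures), the sketch below shows why a position-based id is transient: once a deleted document is squeezed out, every later document shifts down by one. The `key` field and the `compact` helper are hypothetical names used for illustration.

```python
def compact(segment):
    """Drop deleted docs; survivors are renumbered by their new position."""
    return [doc for doc in segment if not doc["deleted"]]


# Hypothetical segment: the internal id is simply the position in the list.
segment = [
    {"key": "A", "deleted": False},
    {"key": "B", "deleted": True},
    {"key": "C", "deleted": False},
]

ids_before = {doc["key"]: docid for docid, doc in enumerate(segment)}
ids_after = {doc["key"]: docid for docid, doc in enumerate(compact(segment))}

print(ids_before)
print(ids_after)
```

Document C moves from position 2 to position 1 here, which is why the internal id should not be relied on as a stable identifier; a field you control (such as your own id field) is the safer choice for a deterministic tiebreaker.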
[VOTE] Create Solr TLP
A single merged project works only when people are relatively on the same page, and when people feel it's mutually beneficial. Recent events make it clear that that is no longer the case.

Improvements to Solr have been recently blocked and reverted on the grounds that the new functionality was not immediately available to non-Solr users. This was obviously never part of the original idea (well actually - it was considered but rejected as too onerous). But the past doesn't matter as much as the present - about how people choose to act and interpret things today.

https://issues.apache.org/jira/browse/SOLR-2272
http://markmail.org/message/unrvjfudcbgqatsy

Some people warned us against merging at the start, and I guess it turns out they were right. I no longer feel it's in Solr's best interests to remain under the same PMC as Lucene-Java, and I know some other committers who have said they feel like Lucene got the short end of the stick. But rather than arguing about who's right (maybe both?) since enough of us feel it's no longer mutually beneficial, we should stop fighting and just go our separate ways.

Please VOTE to create a new Apache Solr TLP.

Here's my +1

-Yonik
Re: [DISCUSS] Lucene Java - Lucene Core
On Mon, Nov 8, 2010 at 1:02 PM, Uwe Schindler u...@thetaphi.de wrote: Die, Contrib, die! We will hopefully only have modules soon? +1 to Lucene Core, Lucene Modules and Solr. As qualifier we can use for Java to differentiate from .NET. But in my opinion, all others should be separate projects and the main project is called Lucene Family for Java (without family but I like it). Right - Lucene Core, Modules, Solr are all the same project. We're only coming up with these different labels because some of the parts may be downloaded and/or documented separately (and have a pre-existing brand associated with that). So distinct labels make sense for Lucene and Solr, but not for contrib and not for Modules (at least not yet). I understand Steven's concern too - the download for Lucene Core is likely to have contrib stuff for some time to come, so there would logically be core and contrib parts to Lucene Core. Although in practice, I don't think that little bit of ambiguity is likely to cause problems. -Yonik http://www.lucidimagination.com
Re: New LuSolr trunk (was: RE: (LUCENE-2297) IndexWriter should let you optionally enable reader pooling)
For Solr, we can just move the current trunk to a 15 branch. -Yonik On Tue, Mar 23, 2010 at 9:39 AM, Grant Ingersoll gsing...@apache.org wrote: On Mar 22, 2010, at 8:27 AM, Uwe Schindler wrote: Hi all, the discussion where to do the development after the merge, now gets actual: Currently a lusolr test-trunk is done as a branch inside solr (https://svn.apache.org/repos/asf/lucene/solr/branches/newtrunk). The question is, where to put the main development and how to switch, so non-developers that have checkouts of solr and/or lucene will see the change and do not send us outdated patches. I propose to do the following: - Start a new top-level project folder inside /lucene root svn folder: https://svn.apache.org/repos/asf/lucene/lusolr (please see lusolr as a placeholder name) and add branches, tags subfolders to it. Do not create trunk and do this together with the next step. OK, I created https://svn.apache.org/repos/asf/lucene/dev/ and given appropriate rights. Uwe, you can now do the rest of the move. Once you've done it, let me know and I can make sure to add back the contrib rights. - Move the branch from https://svn.apache.org/repos/asf/lucene/solr/branches/newtrunk to this new directory as trunk - For lucene flexible indexing, create a corresponding flex branch there and svn copy it from current new trunk. Merge the lucene flex changes into it. Alternatively, land flex now. Or simply do svn copy of current flex branch instead of merging (may be less work). - Do the same for possible solr branches in development - Create a tag in the lucene tags folder and in the solr tags folder with the current state of each trunk. After that delete all contents from old trunk in solr and lucene and place a readme file pointing developers to the new merged trunk folder (for both old trunks). This last step is important, else people who checkout the old trunk will soon see a very outdated view and may send us outdated patches in JIRA. 
When the contents of old-trunk disappear it's obvious to them what happened. If they already had some changes in their checkout, the svn client will keep the changed files as unversioned (after upgrade). The history remains available, so it's also possible to checkout an older version from trunk using @rev or -r rev. I did a similar step with some backwards compatibility changes in lucene (add a README).

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

-----Original Message-----
From: Michael McCandless [mailto:luc...@mikemccandless.com]
Sent: Monday, March 22, 2010 11:37 AM
To: java-...@lucene.apache.org
Subject: Re: (LUCENE-2297) IndexWriter should let you optionally enable reader pooling

I think we should. It (newtrunk) was created to test Hoss's side-by-side proposal, and that approach looks to be working very well.

Up until now we've been committing to the old trunk and then systematically merging over to newtrunk. I think we should now flip that, ie, commit to newtrunk and only merge back to the old trunk if for some strange reason it's needed.

Mike

On Mon, Mar 22, 2010 at 6:32 AM, Uwe Schindler u...@thetaphi.de wrote:

Are we now only working on newtrunk?

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

-----Original Message-----
From: Michael McCandless (JIRA) [mailto:j...@apache.org]
Sent: Monday, March 22, 2010 11:22 AM
To: java-...@lucene.apache.org
Subject: [jira] Resolved: (LUCENE-2297) IndexWriter should let you optionally enable reader pooling

[ https://issues.apache.org/jira/browse/LUCENE-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-2297.

Resolution: Fixed

Fixed on newtrunk.

IndexWriter should let you optionally enable reader pooling
-----------------------------------------------------------

Key: LUCENE-2297
URL: https://issues.apache.org/jira/browse/LUCENE-2297
Project: Lucene - Java
Issue Type: Improvement
Reporter: Michael McCandless
Priority: Minor
Fix For: 3.1
Attachments: LUCENE-2297.patch

For apps that use a large index and frequently need to commit and resolve deletes, the cost of opening the SegmentReaders on demand for every commit can be prohibitive.

We can already pool readers (NRT does so), but we only turn it on if NRT readers are in use.

We should allow separate control.

We should do this after LUCENE-2294.
Re: Branding Solr+Lucene
On Mon, Mar 22, 2010 at 2:20 PM, Ryan McKinley ryan...@gmail.com wrote:
> I'm confused... what is the need for a new name? The only place where
> there is a conflict is in the top level svn tree...

Agree, no need to re-brand.

> What about something general like:
> https://svn.apache.org/repos/asf/lucene/dev
> or
> https://svn.apache.org/repos/asf/lucene/project

Hmmm, that one isn't bad.

-Yonik
Re: [VOTE] merge lucene/solr development (take 3)
On Sun, Mar 14, 2010 at 11:02 AM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote:
> Would it be correct to say that a subset of Lucene/Solr committers discussed
> the proposal internally/offline (i.e. not on MLs) before proposing it?

Nope. Where did this idea come from? I'm quite sure my proposal (my original we-should-just-merge email) was a surprise to everyone. I discussed it with no one previously.

All of the related discussions in previous months had been about pulling stuff out of Solr, why that was disadvantageous to Solr, etc, etc.

-Yonik
Re: [VOTE] merge lucene/solr development (take 3)
On Sun, Mar 14, 2010 at 2:36 PM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote:
> if I understand things correctly, poaching is only needed when the code is
> not committed in the right project/location to begin with.

That is the problem though - Solr should be allowed to keep whatever code was written under its control, w/o pressure to put it in Lucene (and often out of reach). And Lucene should be able to poach what it wants from Solr. But with the projects already half overlapping... it was a recipe for conflict. We've already had conflicts about this in the past.

The conflicts were either going to get worse over time, esp with Solr not on Lucene's trunk, or we were going to merge. We've decided to tear down the artificial wall and work together.

Some people suggest that this could have worked w/o merging. I disagreed, as I think the majority of those voting +1 disagreed.

Not sure who's following lucene-dev and solr-dev, but the committers have already been merged. We're not standing still...

-Yonik
Re: [VOTE] merge lucene/solr development (take 3)
On Sun, Mar 14, 2010 at 2:58 PM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote:
> Would it make sense to think of Solr as one such Lucene module? In other
> words, don't even bother with merging just the -dev lists, but really just
> merge everything. In that case Solr's relationship with Lucene core becomes
> much like the relationship Lucene contribs have with Lucene core today in
> terms of compatibility, builds, and committers' responsibilities? That kind
> of makes sense to me. Of course, because of the sheer volume we may want to
> keep -user lists separate and possibly even create new ones for Lucene
> modules that attract enough interest on their own.

Yes, the general gist of that all makes sense. merge-everything is more along the lines of the original discussion (we just needed to enumerate some specific action items in the vote).

The things we probably don't merge are just for user convenience: separate downloads, websites, user lists. Might have made sense to merge JIRA, but there are just so many open issues... it prob wouldn't be practical.

And yes, more user lists in the future could even make sense - say a separate one for DIH.

-Yonik
Re: [VOTE] merge lucene/solr development (take 3)
Thanks everyone, this vote has passed. A bit more contentious of a PMC vote than usual, but the committer vote was clear.

-Yonik

On Mon, Mar 8, 2010 at 9:11 PM, Yonik Seeley ysee...@gmail.com wrote:
> Apologies in advance for calling yet another vote, but I just wanted
> to make sure this was official.
>
> Mike's second VOTE thread could probably technically stand on its own
> (since it included PMC votes), but given that I said in my previous
> VOTE thread that I was just polling Lucene/Solr committers and would
> call a second PMC vote, that may have acted to suppress PMC votes on
> Mike's thread also.
>
> Please vote for the proposal quoted below to merge lucene/solr development.
> Here's my +1
>
> -Yonik
>
> Mike's call for a VOTE (amongst lucene/solr committers +11 to -1):
> http://search.lucidimagination.com/search/document/a400ffe62ae21aca/vote_merge_the_development_of_solr_lucene_take_2#22d7cd086d9c5cf0
>
> Subject: Merge the development of Solr/Lucene (take 2)
>
> A new vote, that slightly changes proposal from last vote (adding only
> that Lucene can cut a release even if Solr doesn't):
>
> * Merging the dev lists into a single list.
> * Merging committers.
> * When any change is committed (to a module that belongs to Solr or
>   to Lucene), all tests must pass.
> * Release details will be decided by dev community, but, Lucene may
>   release without Solr.
> * Modularize the sources: pull things out of Lucene's core (break
>   out query parser, move all core queries and analyzers under their
>   contrib counterparts), pull things out of Solr's core (analyzers,
>   queries).
>
> These things would not change:
>
> * Besides modularizing (above), the source code would remain factored
>   into separate dirs/modules the way it is now.
> * Issue tracking remains separate (SOLR-XXX and LUCENE-XXX issues).
> * User's lists remain separate.
> * Web sites remain separate.
> * Release artifacts/jars remain separate.
Re: [VOTE] merge lucene/solr development (take 3)
I think the problem is political - and that leads to both technical and political problems. We came up with a largely political solution that should solve both. We can't have a one way street of pulling everything interesting out of Solr for Lucene, or poaching, or expanding Lucene's domain while shrinking Solr's (just limit to server stuff, etc). Lucene and Solr committers are headed down the road toward greater competition - but with this proposal, we said we'd rather work together instead. -Yonik
Re: [VOTE] merge lucene/solr development (take 3)
On Tue, Mar 9, 2010 at 9:48 AM, Mattmann, Chris A (388J) chris.a.mattm...@jpl.nasa.gov wrote:
> I have built 10s of projects that have simply used Lucene as an API and had
> no need for Solr, and I've built 10s of projects where Solr made perfect
> sense. So, I appreciate their separation.

As does everyone - which is why there will always be separate downloads. As a user, the only side effect you should see is an improved Lucene and Solr.

Saying that Solr should move some stuff to Lucene for Lucene's benefit, without regard to whether it's actually beneficial to Solr, is a non-starter. The lucene/solr committers have been down that road before. The solution that most committers agreed would improve the development of both projects is to merge development.

-Yonik
Re: [VOTE] merge lucene/solr development (take 3)
On Tue, Mar 9, 2010 at 11:00 AM, Mattmann, Chris A (388J) chris.a.mattm...@jpl.nasa.gov wrote: However, like I said it seems to be like the discussion of the real issues is only happening recently over the past few days. This certainly isn't new territory for lucene/solr devs though - the issue of what belongs in Solr and what belongs in Lucene, and problems around pulling out schema and faceting and putting it in Lucene have come up before (also in lengthy threads). -Yonik
Re: [VOTE] merge lucene/solr development (take 3)
On Tue, Mar 9, 2010 at 11:35 AM, Michael Busch busch...@gmail.com wrote: No matter if this dev-merge vote passes or not, we still want a separate analysis module, right? No. That's the point of the dev merge - to allow free movement of source code that benefits both projects. -Yonik
[VOTE] merge lucene/solr development (take 3)
Apologies in advance for calling yet another vote, but I just wanted to make sure this was official.

Mike's second VOTE thread could probably technically stand on its own (since it included PMC votes), but given that I said in my previous VOTE thread that I was just polling Lucene/Solr committers and would call a second PMC vote, that may have acted to suppress PMC votes on Mike's thread also.

Please vote for the proposal quoted below to merge lucene/solr development.
Here's my +1

-Yonik

Mike's call for a VOTE (amongst lucene/solr committers +11 to -1):
http://search.lucidimagination.com/search/document/a400ffe62ae21aca/vote_merge_the_development_of_solr_lucene_take_2#22d7cd086d9c5cf0

Subject: Merge the development of Solr/Lucene (take 2)

A new vote, that slightly changes proposal from last vote (adding only that Lucene can cut a release even if Solr doesn't):

* Merging the dev lists into a single list.
* Merging committers.
* When any change is committed (to a module that belongs to Solr or to Lucene), all tests must pass.
* Release details will be decided by dev community, but, Lucene may release without Solr.
* Modularize the sources: pull things out of Lucene's core (break out query parser, move all core queries and analyzers under their contrib counterparts), pull things out of Solr's core (analyzers, queries).

These things would not change:

* Besides modularizing (above), the source code would remain factored into separate dirs/modules the way it is now.
* Issue tracking remains separate (SOLR-XXX and LUCENE-XXX issues).
* User's lists remain separate.
* Web sites remain separate.
* Release artifacts/jars remain separate.
Re: [VOTE] merge lucene/solr development (take 3)
On Mon, Mar 8, 2010 at 9:22 PM, Mattmann, Chris A (388J) chris.a.mattm...@jpl.nasa.gov wrote: For completeness from the VOTE on private@ It's called private for a reason. -Yonik
Re: [VOTE] merge lucene/solr development (take 3)
On Mon, Mar 8, 2010 at 9:49 PM, Michael Busch busch...@gmail.com wrote:
> Question: Is it sufficient to have more +1s than -1s for this vote to pass?

3 +1s and more +1s than -1s is sufficient.

> I thought for votes as significant as this one a -1 veto is a showstopper?

It's not really tied to significance - releases, acceptance to incubate, etc, all require more +1s than -1s.

-Yonik
Re: [VOTE] merge lucene/solr development
+1
Great idea! :-)

-Yonik

On Wed, Mar 3, 2010 at 5:42 PM, Yonik Seeley yo...@apache.org wrote:
> Many Lucene/Solr committers think that merging development would be a
> benefit to both projects. Separate downloads would remain (among
> other things), so end users would not be impacted (except for higher
> quality products over time). Since this is a change to Lucene/Solr
> project development, I'd like to get a formal vote from the committers
> of both projects. If there are 3 +1s and more +1s than -1s, we can
> pass this to the Lucene PMC to ratify.
>
> -Yonik
>
> Discussion thread:
> http://search.lucidimagination.com/search/document/c7817932400808ad/factor_out_a_standalone_shared_analysis_package_for_nutch_solr_lucene
Re: [VOTE] Merge the development of Solr/Lucene (take 2)
+1

-Yonik

On Thu, Mar 4, 2010 at 4:33 PM, Michael McCandless luc...@mikemccandless.com wrote:
> A new vote, that slightly changes proposal from last vote (adding only
> that Lucene can cut a release even if Solr doesn't):
>
> * Merging the dev lists into a single list.
> * Merging committers.
> * When any change is committed (to a module that belongs to Solr or
>   to Lucene), all tests must pass.
> * Release details will be decided by dev community, but, Lucene may
>   release without Solr.
> * Modularize the sources: pull things out of Lucene's core (break
>   out query parser, move all core queries and analyzers under their
>   contrib counterparts), pull things out of Solr's core (analyzers,
>   queries).
>
> These things would not change:
>
> * Besides modularizing (above), the source code would remain factored
>   into separate dirs/modules the way it is now.
> * Issue tracking remains separate (SOLR-XXX and LUCENE-XXX issues).
> * User's lists remain separate.
> * Web sites remain separate.
> * Release artifacts/jars remain separate.
>
> Mike
[VOTE] merge lucene/solr development
Many Lucene/Solr committers think that merging development would be a benefit to both projects. Separate downloads would remain (among other things), so end users would not be impacted (except for higher quality products over time). Since this is a change to Lucene/Solr project development, I'd like to get a formal vote from the committers of both projects. If there are 3 +1s and more +1s than -1s, we can pass this to the Lucene PMC to ratify.

-Yonik

Discussion thread:
http://search.lucidimagination.com/search/document/c7817932400808ad/factor_out_a_standalone_shared_analysis_package_for_nutch_solr_lucene
Re: [VOTE] merge lucene/solr development
On Wed, Mar 3, 2010 at 7:41 PM, Mark Miller markrmil...@gmail.com wrote:
> I'm only for the merge with aligned releases - it's the only way Solr can
> really stay on Lucene trunk happily. Aligned releases are also my biggest
> worry (and part of why I initially leaned against such a merge), but without
> it, there goes all of the larger reasons I'm into the merge now - Solr can
> be on trunk and we can have better sharing / less duplication between the
> projects - which I personally think requires Solr being on Lucene trunk - or
> it won't really work at all. And Solr being on trunk really needs aligned
> releases

Correct - I believe most who agreed with a merge essentially agreed on all the points about what a merge meant: merged dev list, committers, and releases. Maintain separate downloads, user lists.

I wanted to avoid the lawyers... if we need to hammer out all the little things that aren't as important (sync release numbers, etc), discussion will be endless and we'll never get anywhere.

-Yonik
Re: Factor out a standalone, shared analysis package for Nutch/Solr/Lucene?
On Fri, Feb 26, 2010 at 5:15 PM, Steven A Rowe sar...@syr.edu wrote:
> On 02/24/2010 at 2:20 PM, Yonik Seeley wrote:
>> I've started to think that a merge of Solr and Lucene would be in the
>> best interest of both projects.
>
> The Sorlucene :) merger could be achieved virtually, i.e. via policy,
> rather than physically merging:

Everything is virtual here anyway :-)

I agree with Mike that a single dev list is highly desirable. There would still be separate downloads. What to do with some of the other stuff is unspecified. Committers would need to be merged though - that's the only way to make a change across projects w/o breaking stuff.

-Yonik
Re: Factor out a standalone, shared analysis package for Nutch/Solr/Lucene?
I've started to think that a merge of Solr and Lucene would be in the best interest of both projects.

Recently, Solr has pulled back from using Lucene trunk (or even the latest version), as the increased amount of change between releases (and in-between releases) made it impractical to deal with. This is a pretty big negative for Lucene, since Solr is the biggest Lucene user (where people are directly exposed to lucene for the express purpose of developing search features). I know Solr development has always benefited hugely from users using trunk, and Lucene trunk has now lost all the solr users.

Some in Lucene development have expressed a desire to make Lucene more of a complete solution, rather than just a core full-text search library... things like a data schema, faceting, etc. The Lucene project already has an enterprise search platform with these features... that's Solr. Trying to pull popular pieces out of Solr makes life harder for Solr developers, brings our projects into conflict, and is often unsuccessful (witness the largely failed migration of FunctionQueries from Solr to Lucene). For Lucene to achieve the ultimate in usability for users, it can't require Java experience... it needs higher level abstractions provided by Solr.

The other benefit to Lucene would be to bring features to developers much sooner... Solr has had features years before they were developed in Lucene, and currently has more developers working with it. Esp with Solr not using Lucene trunk, if a Solr developer wants a feature quickly, they cannot add it to Lucene (even if it might make sense there) since that introduces a big unpredictable lag - when that version of Lucene makes its way into Solr.

The current divide is a bit unnatural. For maximum benefit of both projects, it seems like Solr and Lucene should essentially merge. Lucene core would essentially remain as it is, but:

1) Solr would go back to using Lucene's trunk
2) For new Solr features, there would be an effort to abstract it such that non-Solr users could use the functionality (faceting, field collapsing, etc)
3) For new Lucene features, there would be an effort to integrate it into Solr.
4) Releases would be synchronized... Lucene and Solr would release at the same time.

-Yonik
Re: [spatial] Cartesian Tiers nomenclature
On Tue, Dec 29, 2009 at 7:13 PM, Marvin Humphrey mar...@rectangular.com wrote:
> ... but for this algorithm, different rasterization resolutions need not
> proceed by powers-of-two.

Indeed - one way to further generalize would be to use something like Lucene's trie-based Numeric field, but with a square instead of a line. That would allow tweaking the space/speed tradeoff.

-Yonik
http://www.lucidimagination.com
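
A rough sketch of the idea, for readers unfamiliar with it. This is not Lucene's implementation or term encoding: `trie_terms`, `tile_terms`, `precision_step` and `bits` are hypothetical names, and the code only illustrates the general principle of indexing one value at several precisions by dropping low-order bits, then doing the same for both coordinates of a point so that each level is a square tile.

```python
def trie_terms(value, precision_step=4, bits=16):
    """Return (shift, prefix) pairs for one value on a line.

    Each pair is the value with its lowest `shift` bits dropped, so the
    list covers the value at progressively coarser precision.
    """
    return [(shift, value >> shift) for shift in range(0, bits, precision_step)]


def tile_terms(x, y, precision_step=4, bits=16):
    """2-D analogue: the square tile containing (x, y) at each resolution."""
    return [
        (shift, x >> shift, y >> shift)
        for shift in range(0, bits, precision_step)
    ]


line_terms = trie_terms(0x1234)
square_terms = tile_terms(0x1234, 0xABCD)
coarser = tile_terms(0x1234, 0xABCD, precision_step=8)

print(line_terms)
print(square_terms)
print(len(square_terms), len(coarser))
```

In this sketch the step size is the tuning knob: a smaller `precision_step` stores more terms per point (more space) but lets a query cover a region with fewer, better-fitting tiles, while a larger step does the opposite.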
Re: [VOTE] Release Apache Lucene Java 2.9.1, take 3
On Thu, Oct 29, 2009 at 7:27 PM, Michael McCandless luc...@mikemccandless.com wrote: OK, let's try this again! I've built new release artifacts from svn rev 831145 (on the 2.9 branch), here: http://people.apache.org/~mikemccand/staging-area/rc3_lucene2.9.1/ Changes are here: http://people.apache.org/~mikemccand/staging-area/rc3_lucene2.9.1changes/ Please vote to officially release these artifacts as Apache Lucene Java 2.9.1. +1 -Yonik http://www.lucidimagination.com
Re: [VOTE] Release Solr 1.4.0
On Thu, Oct 29, 2009 at 8:49 AM, Uwe Schindler u...@thetaphi.de wrote: Yes, it's too bad! But you will replace the lucene jars in the artifacts before releasing? Because it would not be good to have jar files with version 2.9.1 in the package that are not the officially released 2.9.1 artifacts. Darn... forgot about the version number in the jars. Sigh. -Yonik http://www.lucidimagination.com
Re: [VOTE] Release Solr 1.4.0
On Thu, Oct 29, 2009 at 9:07 AM, Bill Au bill.w...@gmail.com wrote:
> I think someone has already pointed this out before. On numerous occasions
> I have had to dig into the Lucene code when writing code to extend Solr. So
> it will be much easier to make sure that I am looking at the right code if
> Solr uses an official release of Lucene, as opposed to a particular SVN
> revision.

And it's much easier for people to use a Solr release if we could actually *release* one!!!

But yes, it looks like we will spin a new Solr release.

-Yonik
http://www.lucidimagination.com

> Bill
>
> On Thu, Oct 29, 2009 at 8:59 AM, Grant Ingersoll gsing...@apache.org wrote:
>
>> Yeah, unfortunately, I think we need to use the new Jars.
>>
>> On Oct 29, 2009, at 8:52 AM, Yonik Seeley wrote:
>>
>>> On Thu, Oct 29, 2009 at 8:49 AM, Uwe Schindler u...@thetaphi.de wrote:
>>>> Yes, it's too bad! But you will replace the lucene jars in the
>>>> artifacts before releasing? Because it would not be good to have jar
>>>> files with version 2.9.1 in the package that are not the officially
>>>> released 2.9.1 artifacts.
>>>
>>> Darn... forgot about the version number in the jars. Sigh.
>>>
>>> -Yonik
>>> http://www.lucidimagination.com
Re: [VOTE] Release Apache Lucene Java 2.9.1
On Mon, Oct 26, 2009 at 12:43 PM, Uwe Schindler u...@thetaphi.de wrote: Looks good. One thing: In Mark's artifacts, he changed the common-build.xml to not have -dev in the version before the release. You can see this in SVN. I am fine with having -dev in the source artefact, because if someone compiles his own bin from the artefact, it should have -dev in it, because it's not an official build. Right, having the -dev when someone tries to build it themselves is the way we should keep it. -Yonik http://www.lucidimagination.com
Re: [VOTE] Release Solr 1.4.0
Hmmm, weren't you going to update the version numbers to 1.4.1-dev like we just discussed in the other thread? That way if someone changes some of the solr source from the download and recompiles, they don't get a version number of 1.4.0 -Yonik http://www.lucidimagination.com On Mon, Oct 26, 2009 at 6:15 PM, Grant Ingersoll gsing...@apache.org wrote: Tis the season for releases... Please vote on releasing the Solr 1.4.0 artifacts located at http://people.apache.org/~gsingers/solr/1.4.0/ (note, solr.tar and solr-maven.tar are not artifacts to be released) CHANGES are spelled out at https://svn.apache.org/repos/asf/lucene/solr/branches/branch-1.4/CHANGES.txt Thanks, Grant
Re: [VOTE] Release Solr 1.4.0
On Mon, Oct 26, 2009 at 9:58 PM, Grant Ingersoll gsing...@apache.org wrote: OK, take two is up in the same place. Please vote. I'm seeing emptiness at http://people.apache.org/~gsingers/solr/1.4.0/ -Yonik http://www.lucidimagination.com On Oct 26, 2009, at 6:15 PM, Grant Ingersoll wrote: Tis the season for releases... Please vote on releasing the Solr 1.4.0 artifacts located at http://people.apache.org/~gsingers/solr/1.4.0/ (note, solr.tar and solr-maven.tar are not artifacts to be released) CHANGES are spelled out at https://svn.apache.org/repos/asf/lucene/solr/branches/branch-1.4/CHANGES.txt Thanks, Grant
Re: [ACUS09] Proposed Schedule
On Tue, Jul 14, 2009 at 4:53 PM, Uwe Schindler u...@thetaphi.de wrote: NumericRangeQuery is not only geographical search... So it would also cover other directions: Things I can do with Lucene additionally to full text search, that could be done before only with RDBMS and/or PostGIS...: Do full text search with scoring and so on in addition to filter my products by price and availability in shops at specific geographic regions; newspaper articles about Arnold and national bankruptcy *g* from a datetime range sorted by article size,... (we know all possibilities we have now by numeric/geo search). Let's not over-state the case: people have been doing all that for years with Lucene. There were many methods to deal with numeric encoding and slow range queries. The Trie* stuff has made it both easier and faster out-of-the-box - a good thing. -Yonik http://www.lucidimagination.com
Re: how to control the disk size of the indices
On Mon, Mar 24, 2008 at 9:34 PM, Otis Gospodnetic [EMAIL PROTECTED] wrote: Hi Yannis, I don't think there is anything of that sort in Lucene, but this shouldn't be hard to do with a process outside Lucene. Of course, optimizing an index increases its size temporarily, so your external process would have to take that into account and play it safe. You could also set mergeFactor to 1, which should keep your index in a fully optimized state. MergeFactor must be >= 2. You will always need to allow for double the index size due to increased temporary disk usage during segment merges (including optimize). Peak use on a system being searched and indexed concurrently will often be even higher since currently open readers reference files that have been deleted. -Yonik
Solr graduates and joins Lucene as sub-project
Solr has just graduated from the Incubator, and has been accepted as a Lucene sub-project! Thanks to all the Lucene and Solr users, contributors, and developers who helped make this happen! I have a feeling we're just getting started :-) -Yonik
Re: Searching by bit masks
On 11/9/06, ltaylor.employon [EMAIL PROTECTED] wrote: I am currently evaluating Lucene to see if it would be appropriate to replace my company's current search software. So far everything has been looking great, however there is one requirement that I am not too certain about. What we need to do is to be able to store a bit mask specifying various filter flags for a document in the index and then search this field by specifying another bit mask with desired filters, returning documents that have any of the specified flags set. In other words, we are doing a bitwise OR on the stored filter bit mask and the specified filter bit mask and if it is non-zero, we want to return the document. Lucene maintains an inverted index, so you don't need a bit mask... you can actually use symbolic values. doc { id=1 tags = tag1 tag3 tag7 } doc { id = 2 tags = tag1 tag2 tag5 tag9 } Then you can search via a BooleanQuery: tags:(tag1 OR tag2 OR tag7) If you are new to Lucene, you might check out Solr first. If nothing else, it would be a gentle introduction to Lucene, and you could build a custom Lucene implementation later if it doesn't meet your needs. -Yonik http://incubator.apache.org/solr Solr, the open-source Lucene search server
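To illustrate the point about symbolic values, here is a minimal sketch in plain Python, not Lucene code. It builds a toy inverted index over the two example documents and answers the "any of these tags" query by taking the union of posting sets. The function names and the third document are assumptions added for illustration only.

```python
from collections import defaultdict

# Documents mirroring the two examples in the message above.
docs = {
    1: ["tag1", "tag3", "tag7"],
    2: ["tag1", "tag2", "tag5", "tag9"],
    3: ["tag4"],  # hypothetical extra document, added only to show a non-match
}

def build_inverted_index(docs):
    """Map each tag to the set of document ids that carry it."""
    index = defaultdict(set)
    for doc_id, tags in docs.items():
        for tag in tags:
            index[tag].add(doc_id)
    return index

def search_any(index, wanted_tags):
    """Return ids of documents that have at least one of the wanted tags."""
    result = set()
    for tag in wanted_tags:
        result |= index.get(tag, set())  # union of posting sets = OR query
    return sorted(result)

index = build_inverted_index(docs)
matches = search_any(index, ["tag1", "tag2", "tag7"])
print(matches)
```

Each flag becomes a term, so "any of these flags set" is an OR over terms rather than arithmetic on a stored mask.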
Re: [PROPOSAL] index server project
On 10/19/06, Steven Parkes [EMAIL PROTECTED] wrote: You mention partitioning of indexes, though mostly around delete. What about scalability of corpus size? Definitely in scope. Solr already has scalability of search volume via searchers behind a load balancer all getting their index from a master. The problem comes when an index is too big to get decent latency for a single query, and that's when you need to partition the index into shards, to use Google terminology. Would partitioning be effective for that, too? Yes, to a certain extent. At some point you run into network bandwidth issues if you go deep into rankings. What about scalability of ingest rate? As it relates to indexing, I think Nutch already has that base covered. What are you thinking, in terms of size? Is this a 10 node thing? I'm personally interested in perhaps 10 to 20 index shards, with multiple replicas of each shard for HA and query load scalability. A 1000 node thing? More? Bigger is cool, but raises a lot of issues. Should be possible, but I won't personally be looking for that. I think scaling effectively will be partially in the hands of the client and how it chooses to merge results from shards. How dynamic? Can nodes come and go? Unplanned: yes. HA is personally key for me. Planned (adding capacity gracefully): it would be nice. I actually hadn't planned it for Solr. Are you going to assume homogeneity of nodes? Hardware homogeneity? That might be out of scope... I'd start off without worrying about it in any case. What about add/modify/delete to search visibility latency? Close to batch/once-a-day or real-time? Anywhere in between I'd think. Realtime latencies of minutes or longer are normally fine. -Yonik http://incubator.apache.org/solr Solr, the open-source Lucene search server
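The remark about the client merging results from shards can be sketched as follows. This is a rough illustration in plain Python, not Solr or Lucene code; the hit layout (score, document id) and the function name are assumptions. Each shard returns its own top hits and the client keeps the best k overall.

```python
import heapq

def merge_shard_results(shard_results, k):
    """Merge per-shard hit lists into a single global top-k by score.

    shard_results: list of lists of (score, doc_id) tuples, one list per shard.
    """
    all_hits = (hit for hits in shard_results for hit in hits)
    return heapq.nlargest(k, all_hits, key=lambda hit: hit[0])

# Hypothetical hits from two shards.
shard_a = [(0.9, "a1"), (0.4, "a2")]
shard_b = [(0.7, "b1"), (0.5, "b2")]

top3 = merge_shard_results([shard_a, shard_b], k=3)
print(top3)
```

For a correct global top k, every shard has to send back k hits of its own, so the data transferred grows with both the number of shards and the depth requested. That is one way to read the bandwidth concern about going deep into rankings.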
Re: [PROPOSAL] index server project
On 10/18/06, Doug Cutting [EMAIL PROTECTED] wrote: Does this make sense? Does it sound like it would be useful to Solr? To Nutch? To others? Who would be interested and able to work on it? Rather than holding my tongue until I wrap my head around all the issues, I'll say that I'm definitely interested! -Yonik
Re: Infrastructure for large Lucene index
On 10/6/06, James [EMAIL PROTECTED] wrote: Our indexes are, in aggregate across our various collections, even larger than you need. We use Remote ParallelMultiSearcher, with some custom modifications (and we are in the process of making more) I'm looking into adding some form of distributed search to Solr. The main problem I see with directly using ParallelMultiSearcher is a lack of high availability features. If the index is broken into multiple shards then we need multiple copies of each shard, and some way of load balancing and failing over amongst copies of shards. -Yonik http://incubator.apache.org/solr Solr, the open-source Lucene search server
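One possible shape for "failing over amongst copies of shards" is sketched below in plain Python. It is only an illustration of the idea, not how Solr or ParallelMultiSearcher works; the replica names, the `send` callback and the `start` offset are all hypothetical.

```python
def query_shard(replicas, send, start=0):
    """Try each replica of one shard in turn, beginning at index `start`.

    `send` is a caller-supplied function that queries one replica and
    raises ConnectionError if that replica is unreachable.
    Rotating `start` between requests spreads load across replicas.
    """
    last_error = None
    for i in range(len(replicas)):
        replica = replicas[(start + i) % len(replicas)]
        try:
            return send(replica)
        except ConnectionError as err:
            last_error = err  # remember the failure and try the next copy
    raise RuntimeError("all replicas failed") from last_error

def fake_send(replica):
    # Stand-in for a real network call: the first copy is "down".
    if replica == "shard1-a":
        raise ConnectionError("replica down")
    return f"results from {replica}"

answer = query_shard(["shard1-a", "shard1-b"], fake_send)
print(answer)
```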
Re: Infrastructure for large Lucene index
On 10/6/06, Slava Imeshev [EMAIL PROTECTED] wrote: -- James [EMAIL PROTECTED] wrote: If the index is broken into multiple shards then we need multiple copies of each shard, and some way of load balancing and failing over amongst copies of shards. Yep. Unfortunately it's not simple, but those are all pieces of what we are currently in the process of implementing. The problem is that over time indexes develop personality and the term frequency can vary significantly from index to index A global idf calculation is possible though... MultiSearcher already does this when searching across multiple indices. The downside of doing it across remote indices is an increase in the number of RPC calls. In general, it's probably better to try and keep index shards balanced. -Yonik http://incubator.apache.org/solr Solr, the open-source Lucene search server
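A small sketch of what a global idf calculation involves, in plain Python rather than Lucene code. It assumes the classic Lucene-style formula idf = 1 + ln(numDocs / (docFreq + 1)); the shard statistics are made-up numbers for illustration. The global value is computed from summed document frequencies and document counts, which is why each shard's statistics have to be fetched before scoring.

```python
import math

def idf(doc_freq, num_docs):
    """Classic Lucene-style idf (assumed formula): 1 + ln(numDocs / (docFreq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))

# Hypothetical per-shard statistics for a single term.
shards = [
    {"num_docs": 1000, "doc_freq": 9},    # term is rare in this shard
    {"num_docs": 1000, "doc_freq": 499},  # term is common in this shard
]

# Scoring each shard in isolation gives very different weights to the same term.
local = [idf(s["doc_freq"], s["num_docs"]) for s in shards]

# A global idf uses the summed statistics instead.
global_idf = idf(
    sum(s["doc_freq"] for s in shards),
    sum(s["num_docs"] for s in shards),
)
print(local, global_idf)
```

With balanced shards the local and global values stay close, which fits the advice to keep index shards balanced.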
Re: Binary fields in index
Binary fields can be stored, but not indexed. -Yonik Now hiring -- http://tinyurl.com/7m67g On 9/26/05, Fredrik Andersson [EMAIL PROTECTED] wrote: I was hoping to avoid the overhead of encoding/decoding, but it looks like I'll have to do that :( While on the topic, I noticed in the Field class that we have an isBinary boolean flag, however this always gets set to false in the constructors as well as the default value, and I can't even see a usage of this flag at write-time. What's the point of this flag, a feature for binary fields that was never implemented? I'm talking about the latest sources now, by the way, 1.9.something. Fredrik On 9/26/05, Koji Sekiguchi [EMAIL PROTECTED] wrote: You can encode (e.g. base64) the binary data to get a String and store the String. Koji -Original Message- From: Fredrik Andersson [mailto:[EMAIL PROTECTED] Sent: Monday, September 26, 2005 6:31 PM To: general@lucene.apache.org Subject: Binary fields in index Hello Gang! Is there any trick, or undocumented way, to store binary (unindexed, untokenized) data in a Lucene Field? All the Field constructors just deal with Strings. I'm currently using another database to store binary data, but it would be very neat, and more efficient, to store it directly in Lucene. Thanks in advance, Fredrik
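The base64 workaround Koji suggests can be sketched like this, using only Python's standard library. The helper names are assumptions for illustration, and the actual storing of the string in a Lucene field is left out.

```python
import base64

def encode_for_storage(data: bytes) -> str:
    """Turn arbitrary bytes into an ASCII string that a String-only field can hold."""
    return base64.b64encode(data).decode("ascii")

def decode_from_storage(text: str) -> bytes:
    """Recover the original bytes from the stored string."""
    return base64.b64decode(text)

payload = bytes([0, 255, 16, 128])  # sample binary data, not valid text
stored = encode_for_storage(payload)
restored = decode_from_storage(stored)
print(len(payload), len(stored), restored == payload)
```

The encoded form takes 4 characters for every 3 input bytes (rounded up), which is the size side of the encoding overhead Fredrik mentions.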