Hi Ahmed,
I think you should repost your question on the u...@nutch.apache.org list since
Nutch is its own Top Level Project now.
HTH,
Chris
On May 5, 2011, at 7:38 AM, Ahmed Moustafa wrote:
Hello;
I have a question about Nutch version 1.2.
Does anyone know how Nutch 1.2 (crawl or
Thanks, Marvin.
I'm pleased to relay that Chris Hostetter has agreed to take on the role of
Champion and Chris Mattmann has agreed to be a Mentor.
Here are the changes that the draft proposal has undergone since it was posted
here on Friday:
Hi Itamar,
I think what you would do is throw together a proposal mentioning things
like:
* who would be the initial committers for the project
* whether those committers have Apache ICLAs [1] on file, or not
* what's the rationale behind the project (yours would have strong
rationale, since
Hi Guys,
I updated the Lucene website Forrest pages to reflect the move of Nutch and
Tika to TLP. The site should be updated within a few hours as I just SVN
up'ed on people.a.o.
Thanks!
Cheers,
Chris
++
Chris Mattmann, Ph.D.
Oooh I'll fix that, just gotta copy it to our new location on people.a.o.
Heh, sorry, one sec!
Cheers,
Chris
On 5/12/10 9:12 AM, Andrzej Bialecki a...@getopt.org wrote:
On 2010-05-12 16:29, Mattmann, Chris A (388J) wrote:
Hi Guys,
I updated the Lucene website Forrest pages to reflect
a TLP that you delay this release by a
few weeks and have the vote done under the auspices of the Nutch PMC?
Cheers,
Grant
On Apr 26, 2010, at 1:55 AM, Mattmann, Chris A (388J) wrote:
Hi Folks,
I have posted an updated candidate for the Apache Nutch 1.1 release. The
source code
Hi Hoss,
: Thanks. I think it actually makes sense to finish off 1.1, and since
: there is overlap with the Nutch PMC and the Lucene PMC and since the
: thread started in Lucene before the TLP, I think it would be great e.g.,
Except that once the Board officially passed the resolution
Hi Folks,
I have posted an updated candidate for the Apache Nutch 1.1 release. The
source code is at:
http://people.apache.org/~mattmann/apache-nutch-1.1/rc2/
The major difference between this release and rc #1 is the application of
NUTCH-812 - Crawl.java incorrectly uses the Generator API
Hi Grant,
I've attached one for Nutch from a while back that I made for a lecture I gave
at USC.
Cheers,
Chris
On 4/22/10 10:56 AM, Grant Ingersoll gsing...@apache.org wrote:
Hi All,
The ASF has been asked by an Industry Analyst group (to remain unnamed at this
point) to provide
Hi Sami,
I have not yet had time to functionally review the package but I
spotted a couple of things:
-I ran rat (this should really be integrated into the build) and fixed
a few Java source files that were lacking license headers.
Saw that, thanks. I can cut a new RC later today with your
,
Chris
On 4/16/10 7:19 AM, Andrzej Bialecki a...@getopt.org wrote:
On 2010-04-16 15:53, Mattmann, Chris A (388J) wrote:
-Is the intent now to release just a source package? If so I think the
tutorial should be updated to reflect that change.
Yeah I was thinking since we primarily do
*nudge*
Hi guys, so far we have 2 +1 votes on this RC from myself and Andrzej --
another PMC member review would be great so I can push this release out...
Thanks!
Cheers,
Chris
On 4/9/10 9:19 AM, Andrzej Bialecki a...@getopt.org wrote:
On 2010-04-07 07:14, Mattmann, Chris A (388J) wrote
Hi Folks,
I have posted a candidate for the Apache Nutch 1.1 release. The source code
is at:
http://people.apache.org/~mattmann/apache-nutch-1.1/rc1/
See the included CHANGES.txt file for details on release contents and latest
changes. The release was made using the Nutch release process,
Hi All,
This VOTE has passed.
+1s:
Binding:
Chris A. Mattmann
Jukka Zitting
Grant Ingersoll
Uwe Schindler
Non-Binding:
Oleg Tikhonov
I'll get started pushing the releases out to the mirrors, and then send an
ANNOUNCE to annou...@.
Thanks, again, everyone!
Cheers,
Chris
to
include the sha1 of the src archive from jzitting. Will do on both, going
forward.
* +1 for having a direct link to tika-app on the website.
Cheers,
Chris
On 4/1/10 11:41 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
Hi,
On Wed, Mar 31, 2010 at 10:01 PM, Mattmann, Chris A (388J
Grant, FYI:
On 4/2/10 7:14 AM, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
* Thanks for comments on the CHANGES from gsingers, and the mention to
include the sha1 of the src archive from jzitting. Will do on both, going
forward.
I added a stub for this in Tika 0.8:
http
- the same applies for sha1 files. ANT does it correctly, not
sure how you create the md5's with maven?
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
-Original Message-
From: Mattmann, Chris A (388J) [mailto:chris.a.mattm
Welcome, Uwe!
Cheers,
Chris
On 4/1/10 4:05 AM, Grant Ingersoll gsing...@apache.org wrote:
I'm pleased to announce that the Lucene PMC has voted to add Uwe Schindler to
the PMC. Uwe has been doing a lot of work in Lucene and Solr, including
several of the last releases in Lucene.
Please
Hi Grant,
+1, what do you think of these mods?
On 3/22/10 8:15 AM, Grant Ingersoll gsing...@apache.org wrote:
Riffing off of last year's... Could use a little more meat, IMO, but I don't
think my brain is on yet.
snip
Apache Lucene is a robust and powerful, open source, search toolkit
Hi Grant,
One correction:
capabilities or the large scale crawling features of Nutch, chances are
Lucene
capabilities, or simply power web-scale search via Apache Nutch, changes are
Lucene
capabilities, or simply power web-scale search via Apache Nutch, chances are
(changes=chances)
projects as well as
other members of the rapidly growing Lucene community. We look
forward to seeing you there!
/snip
Cheers,
Chris
On 3/23/10 8:50 AM, Mattmann, Chris A (388J) chris.a.mattm...@jpl.nasa.gov
wrote:
Hi Grant,
One correction:
capabilities or the large scale crawling features
What are the implications of this new branding effort for the existing
Lucene and Solr brands? Will the names Lucene and Solr cease
in the mainstream in favor of a merged name?
Cheers,
Chris
On 3/22/10 11:02 AM, Steven A Rowe sar...@syr.edu wrote:
Now that Solr and Lucene live
Hey Grant,
Also, I really don't see what is so drastic about the proposal. All we're
doing is making it easier for code to be put in the right place. We're not
having Lucene consumed by Solr nor vice versa. As you've seen by the Board's
indication, they only view that there should be a
Hey Grant,
Yeah, I agree this would be a good thing and would make sense in the near
future. I think we should iron out the TLP stuff first and we should let
Nutch discuss what its plans for modularization are. But yeah, I'm
definitely open to discussing it.
I think we sometimes also
Hey Patrick,
Actually Mark the problem with Spatial is that there haven't been enough
folks involved in it as a project. I am a single point of failure for it. So
I have gone about solving that, by getting assistance from several experts
in this field to help put this proposal together
Hi Simon,
On 3/12/10 4:30 AM, Simon Willnauer simon.willna...@googlemail.com
wrote:
I don't think that is the case. A large amount of different concerns
are out there. Simply based on the amount of huge comments this
seems to be not a clearly passed vote.
simon
Agreed.
Cheers,
Chris
Hi Bernd,
On Fri, Mar 12, 2010 at 04:29, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
Hi Yonik,
IMO, this vote has not passed. A bullet of this proposal proposes code
modifications and this is subject to VETO per Apache guidelines:
http://www.apache.org/foundation
Here's what I didn't like. The vote was:
* ambiguous
* something that the Solr devs tried to push through and bullied folks on
during discussion (those who originally had questions were persuaded that it
was the right thing to do by those in the PMC leadership).
* not healthy for the project
*
Hey All,
I started a thread over in tika-dev@ to discuss TLP. I'll take that feedback
after a few weeks and bring it to the larger community (and eventually PMC) to
discuss/vote. Key phrase: after a few weeks.
I'm VOTE'd out and am going to get back to work for a while. Take care.
Cheers,
Hi Yonik,
IMO, this vote has not passed. A bullet of this proposal proposes code
modifications and this is subject to VETO per Apache guidelines:
http://www.apache.org/foundation/voting.html#Veto
Since that point is up for debate, I think we can get clarification on this
from the board at their
Hey Grant,
On 3/9/10 5:49 AM, Grant Ingersoll gsing...@apache.org wrote:
For that matter, why do we even need to have this discussion at all? Most of
us Solr committers are Lucene committers. We can simply start committing Solr
code to Lucene such that in 6 months the whole discussion is
Hi Mike,
As someone who works on both, I don't think it is fine. Just look at the
function query mess. Just look at the version mess. It's very frustrating
as a developer and it makes me choose between two projects that I happen to
like equally, but for different reasons. If I worked on
Hi Yonik,
I have built 10s of projects that
have simply used Lucene as an API and had no need for Solr, and I've built
10s of projects where Solr made perfect sense. So, I appreciate their
separation.
As does everyone - which is why there will always be separate
downloads. As a user, the
Hi Robert,
2. duplicate the Lucene code in Solr, address any issues there, and then
contribute it back
Not that I can stop anything, but -1 to any further analysis code
duplication. There has to be a better way.
There might be, but as a first start, duplication is a quick way to get
Hi Robert,
There might be, but as a first start, duplication is a quick way to get
going and experiment. As solutions that evolve over time are matured, the
time can come for integration. Parallel tracks allow projects to move
forward operationally, and enforces insulation, loose coupling
Hey Yonik,
However, like I said, it seems like
the discussion of the real issues has only been happening over the past
few days.
This certainly isn't new territory for lucene/solr devs though - the
issue of what belongs in Solr and what belongs in Lucene, and problems
around
Haha it dates me too but that's OK I know I'm a young-un! :)
Cheers,
Chris
On 3/9/10 8:35 AM, Grant Ingersoll gsing...@apache.org wrote:
On Mar 9, 2010, at 11:29 AM, Mattmann, Chris A (388J) wrote:
Hey Yonik,
However, like I said, it seems like
the discussion of the real issues
For completeness from the VOTE on private@
PMC votes:
+1
Mark Miller
Michael McCandless
Yonik Seeley
Ryan McKinley
-0
Doug Cutting
-1
Dennis Kubes
Scott Ganyo
Chris Mattmann
Cheers,
Chris
On 3/8/10 6:11 PM, Yonik Seeley ysee...@gmail.com wrote:
Apologies in
Yet the information we were voting on is public information really and this
doesn't really count as sensitive IMO.
Anything I send to private@, I kind of count on not being public. I'd
rather you not decide that for me. In this case, I'm not terribly upset
that my private vote has gotten out
Hi All,
I'll preface my vote in this by mentioning I'm neither a Lucene(-java) nor
Solr committer, but I am -1 for this proposal in its current form, per my
comments referenced in the below URL during the existing discussions.
Cheers,
Chris
On 3/3/10 3:42 PM, Yonik Seeley yo...@apache.org
Hi All,
-1 for the same reasons I mentioned previously. Again, I'm wearing my
I'm-interested-in-this-discussion-but-not-a-Lucene/Solr-committer hat.
Cheers,
Chris
On 3/4/10 1:33 PM, Michael McCandless luc...@mikemccandless.com wrote:
A new vote, that slightly changes proposal from last vote
Hey Grant,
I'd like to explore this: does this imply that the Lucene sub-projects will
go away and Lucene will turn into Lucene-java and maintain its Apache TLP,
and then you'd have say, solr.apache.org, tika.apache.org, mahout.apache.org
(already started), etc. etc.? If so, that may be the best
it. The separation likely didn't spring up over
night and has been this way for a while (at least to my knowledge). This is
exactly the type of situation that typically leads to TLP creation from what
I've seen.
Cheers,
Chris
On 03/01/2010 10:04 AM, Mattmann, Chris A (388J) wrote:
Hey Grant,
I'd
status, and it
is seeming more and more like (at least) Solr does...
Cheers,
Chris
On Mar 1, 2010, at 7:04 AM, Mattmann, Chris A (388J) wrote:
Hey Grant,
I'd like to explore this: does this imply that the Lucene sub-projects will
go away and Lucene will turn into Lucene-java and maintain its
9:12 AM, Robert Muir rcm...@gmail.com wrote:
this will make the analyzers duplication problem even worse
On Mon, Mar 1, 2010 at 11:06 AM, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
Hi Mark,
Thanks for your message. I respect your viewpoint, but I respectfully
disagree
Hi Grant,
On Mar 1, 2010, at 8:20 AM, Mattmann, Chris A (388J) wrote:
Hi Robert,
I think my proposal (Solr-TLP) is sort of orthogonal to the whole analyzers
issue - I was in favor, at the very least, of having a separate
module/project/whatever that both Solr/Lucene (and whatever project
at 12:01 PM, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
Hi Grant,
On Mar 1, 2010, at 8:20 AM, Mattmann, Chris A (388J) wrote:
Hi Robert,
I think my proposal (Solr-TLP) is sort of orthogonal to the whole analyzers
issue - I was in favor, at the very least, of having a separate
in a way that we don't waste effort, and so that both direct
Lucene and Solr users could use it when it's released?
Mike
On Mon, Mar 1, 2010 at 1:07 PM, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
Hi Mike,
I'm not sure I follow this line of thinking: how would Solr being a TLP
/2010 at 1:28 PM, Mattmann, Chris A (388J) wrote:
http://incubator.apache.org/projects/sis.html
We're just starting to tackle that very issue right
now...patches/ideas/contributions welcome.
Patches? SVN https://svn.apache.org/repos/asf/incubator/sis/ looks empty ATM:
asf - Revision
Hey Hoss,
I support Mike's original suggestion of having a shared, independently
maintained/released analysis package for Nutch/Solr/Lucene. I emphatically do
not support merging Solr and Lucene in the way proposed.
Hope that clarifies things, at least from me.
Cheers,
Chris
On 3/1/10
, this issue.
Cheers,
Chris
On Mon, Mar 1, 2010 at 1:28 PM, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
I'm glad that you brought that up! :)
Check out:
http://incubator.apache.org/projects/sis.html
We're just starting to tackle that very issue right
now...patches/ideas
Hi Mark,
Thanks for the feedback. My concern is that if the two communities are pretty
separate, then it is going to be more difficult merging them, and it's not
always a good thing to take separated modules (or communities) and integrate
them into a monolith, whether it be physically in the
,
Chris
On 1/27/10 12:01 PM, Ted Dunning ted.dunn...@gmail.com wrote:
+0
On Wed, Jan 27, 2010 at 7:26 AM, Grant Ingersoll gsing...@apache.org wrote:
On Jan 20, 2010, at 1:56 AM, Mattmann, Chris A (388J) wrote:
Hi Folks,
I have posted a candidate for the Apache Tika 0.6 release
Welcome, Mark!
Cheers,
Chris
On 1/14/10 7:37 AM, Grant Ingersoll gsing...@apache.org wrote:
I'm pleased to announce the Lucene PMC has elected to add Mark Miller to its
ranks in recognition of his longstanding contributions to the Lucene community
as a committer on both Lucene Java and
Most of the atmospheric/climate/earth scientists that I work with refer to
these tiers as grid boxes.
I think you'll find different answers though, depending on who you ask. The
scientific community is a bit different than GIS/decision support folks...
Chris
On 12/28/09 9:49 AM, Ryan McKinley
Hi Patrick,
Interesting. It seems like there is a precedent already in the Local Lucene
and Local SOLR packages that define CartesianTier as lingua franca.
Like I said in an earlier email it depends on who you talk to regarding the
preference of what to call these Tiles/Grids/Tiers, etc., and
(...apologies for the cross posting...)
The Apache Lucene project is pleased to announce the release of Apache Tika
0.5. The release contents have been pushed out to the main Apache release
site and the m2 ibiblio sync, so the releases should be available as soon as
the mirrors get the syncs.
Hi All,
The vote passes. Here are the result tallies:
PMC Votes:
+1 Jukka Zitting
+1 Grant Ingersoll
+1 Chris Mattmann
Non-binding votes:
-1 Karl Heinz Marbaise
Yuan-Fang Li mentioned on TIKA-309 that he is still seeing the behavior,
even after the patch. However, I don't see the behavior
Hey Jukka,
The only issue I see with the release are the temporary Maven release
plugin files found in the source archive. Did you take the package
from target/ or from target/checkout/target/ after mvn
release:perform?
I took the zip archive from target, oops! Sorry about that -- my
Hi Folks,
I have posted a candidate for the Apache Tika 0.5 release. The source code
is at:
http://people.apache.org/~mattmann/apache-tika-0.5/rc1/
See the included CHANGES.txt file for details on release contents and latest
changes. The release was made using the Maven2 release plugin,
Hey Grant and everyone,
Thanks so much for voting me in. It's been a great time working within the
Apache Lucene community for the last 4-5 years and I really think the work is
innovative. Great job out there to all the sub-projects. It's really a
canonical reference architecture for building
(...apologies for the cross posting...)
The Apache Lucene project is pleased to announce the release of Apache Tika
0.4. The release contents have been pushed out to the main Apache release
site and the m2 ibiblio sync, so the releases should be available as soon as
the mirrors get the syncs.
Hey Jukka,
Some comments (none blocking):
* It would also be good to have a SHA1 checksum
(ad04d3e02be57a51b5f446c4f921d9280e5b11b9) of the release archive.
+1, done.
* As mentioned by Grant, it would be good to have you included in the
Apache Web of Trust. See