[
https://issues.apache.org/jira/browse/NUTCH-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500604#comment-14500604
]
Mattmann, Chris A (388J) commented on NUTCH-1927:
-
+1 please commit
[
https://issues.apache.org/jira/browse/NUTCH-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121477#comment-14121477
]
Mattmann, Chris A (388J) commented on NUTCH-1832:
-
Will reply in more
at 4:29 PM, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
Hey Kiran,
I think here:
http://wiki.apache.org/general/OurWikiFarm#per_wiki_access_control_-_tighten_your_wiki_just_a_little.2C_benefit_just_a_lot
Cheers,
Chris
Thanks Kiran!
++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
Seconded!
Hi Kiran,
Yes, my recommendation:
1. Jump into #asfinfra on freenode, find Joe, Gavin, or Daniel, and
ask for help. If you don't have IRC, email infrastruct...@apache.org
and/or file a https://issues.apache.org/jira/browse/INFRA ticket
2. Request that they enable ContributorsGroup-only access ASAP
Hey Julien,
I heard on #asfinfra that many of our MoinMoin wikis have been attacked recently
by spam.
I think we may want to contact infra and ask for ContributorsGroup-only
access for the Nutch wiki.
http://wiki.apache.org/general/OurWikiFarm
Cheers,
Chris
From: Julien Nioche
a decent UI
running with functionalities.
Regards,
Kiran.
On Sat, Mar 23, 2013 at 2:33 PM, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
That is so awesome Kiran.
Great job and I would love a link to your thesis (or even seeing the work
in progress)
if you are willing to share
Super +1 -- sounds awesome Lewis.
Cheers,
Chris
On 3/24/13 12:38 PM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com
wrote:
Hi All,
After some discussion and drumming up of interest within the Giraph
community, I've logged a Google Summer of Code issue [0] for this topic.
We are looking for
with
functionalities.
Regards,
Kiran.
On Sat, Mar 23, 2013 at 2:33 PM, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
That is so awesome Kiran.
Great job and I would love a link to your thesis (or even seeing the work in
progress)
if you are willing
possible.
Thank you,
Kiran.
On Sat, Mar 23, 2013 at 11:23 AM, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
Hi Kiran,
Great, yes the REST services need work for sure. They haven't been worked on in
a while.
I'm privy to Apache CXF, but I
Hey Guys,
I posted:
https://issues.apache.org/jira/browse/NUTCH-841
As a potential GSOC 2013 summer project. I'm willing to mentor it, since I
love
Wicket, and I'm willing to maintain the result as a Nutch committer.
If NUTCH-841 doesn't get selected, I'll start implementing it this summer
if
[Apologies for cross post]
Guys, to play in the GSoC 2013 spec, we just need to tag issues in JIRA
with the gsoc2013 tag.
I'll try and come up with few projects soon :)
Cheers,
Chris
On 3/15/13 11:15 AM, Luciano Resende luckbr1...@gmail.com wrote:
On Fri, Mar 15, 2013 at 11:01 AM, Manish
This is great to hear Kiran, welcome to the team!
Cheers,
Chris
From: Julien Nioche
lists.digitalpeb...@gmail.com
Reply-To: dev@nutch.apache.org
dev@nutch.apache.org
Date: Sunday, March 10, 2013 2:15 PM
FYI
On 3/10/13 5:10 PM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com
wrote:
I just told a huge lie.
I got my dates mixed up...
Students have from between April 22nd and May 3rd to get proposals in.
Sorry about the mix up.
Lewis
On Sun, Mar 10, 2013 at 5:09 PM, Lewis John Mcgibbney
Hi Tejas,
Yeah I was having some issue at the time, but will try and see if it is working
tomorrow. If it's still not working we can contact infra@
Cheers,
Chris
From: Tejas Patil tejas.patil...@gmail.com
Reply-To: dev@nutch.apache.org
Hey Lewis,
Great job starting this thread. +1 Giraph is welcome here. Multi-project GSoCs
always do well.
One thing I had in mind was taking an implementation of Hubs and Authorities
developed for
Nutch 1.3 a few years back in my USC class and then having someone integrate it
into the
current
Hey Markus,
Yep, my student implemented HITS (on the fly) ranking, and classification (I
think).
It's been sitting on my HD for 2 years :(
So if someone can pick it up it would be a nice GSoC project.
Glad to hear there is interest.
Cheers,
Chris
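For readers unfamiliar with HITS (Hubs and Authorities), here is a minimal sketch of the standard power-iteration form of the algorithm. It is not the student implementation mentioned above, which is not shown in this thread; the function name hits, the links dictionary, and the iteration count are all assumptions chosen for illustration.

```python
import math


def hits(links, iterations=50):
    """Compute hub and authority scores by power iteration.

    `links` maps each page to the list of pages it links to
    (a hypothetical in-memory link graph, not a Nutch data structure).
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    hubs = {page: 1.0 for page in pages}
    auths = {page: 1.0 for page in pages}

    for _ in range(iterations):
        # Authority of a page: sum of the hub scores of pages linking to it.
        auths = {page: 0.0 for page in pages}
        for source, targets in links.items():
            for target in targets:
                auths[target] += hubs[source]
        # Hub score of a page: sum of the authority scores of pages it links to.
        hubs = {page: sum(auths[t] for t in links.get(page, ())) for page in pages}
        # Normalise both vectors to unit Euclidean length so scores stay bounded.
        for scores in (auths, hubs):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for page in scores:
                scores[page] /= norm
    return hubs, auths
```

A call such as hits({"a": ["c"], "b": ["c"], "c": []}) returns two dictionaries keyed by page. Integrating something like this into Nutch would presumably mean running it over the link database rather than an in-memory dictionary.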
On 3/4/13 1:21 PM, Markus Jelsma
Hey Markus:
https://issues.apache.org/jira/browse/NUTCH-1539
Will submit the code soon.
Cheers,
Chris
On 3/4/13 1:43 PM, Markus Jelsma markus.jel...@openindex.io wrote:
Ah yes! Please open an issue and if you can attach anything that matters
such as a description of the algorithm, how it
Hi Shann,
Thank you for reaching out! If your goal is to get your project integrated
into Apache Nutch,
proper, then I would recommend simply:
0. File some JIRA issues in Apache Nutch
http://issues.apache.org/jira/browse/NUTCH Small incremental patches and
issues are preferred and this will let
[Sorry for cross posting]
Guys,
FYI please note that you can participate as a mentor from a PMC via Apache as
they are a GSoC org. ComDev will coordinate our participation but start
thinking about what projects we may want to do.
Cheers,
Chris
From: Carol Smith
I love it and will use it, but I don't think it needs to be a policy. To each
their own :)
Thanks buddy
Sent from my iPhone
On Jan 31, 2013, at 3:58 PM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com
wrote:
Hi All,
I thought I would create this thread as the Review Board platform has
been
Hey Tejas,
Yeah I think this has to do with something in the repo URL on the RB server
side. I would file an INFRA ticket, or jump on #asfinfra on IRC and ask one of
the guys for help there.
Cheers,
Chris
From: Tejas Patil tejas.patil...@gmail.com
Reply-To:
woot yep ;)
On 12/21/12 2:55 AM, Markus Jelsma markus.jel...@openindex.io wrote:
forget it, i meant 1.7 but it's there already!
-Original message-
From:Markus Jelsma markus.jel...@openindex.io
Sent: Fri 21-Dec-2012 11:54
To: dev@nutch.apache.org dev@nutch.apache.org
Subject: 1.8 in
Thanks guys.
I should review this today.
Cheers,
Chris
On Nov 29, 2012, at 5:31 AM, Lewis John Mcgibbney wrote:
Hi,
On Wed, Nov 28, 2012 at 10:11 AM, Julien Nioche
lists.digitalpeb...@gmail.com wrote:
- CHANGES.txt contains dates in both MM/DD/ and DD/MM/ formats.
Shall we
Hey Lewis,
On Nov 29, 2012, at 5:54 AM, Lewis John Mcgibbney wrote:
Hi All,
Right now I found myself facing a bit of a dilemma w.r.t bumping on
the issues for the next Nutch release.
Currently due to legacy workflows, we have some 120 issues assigned
for 1.6... however ALL issues have
+50 :)
On Nov 29, 2012, at 8:32 AM, Lewis John Mcgibbney wrote:
So in summary,
We retain the legacy behavior and bump them ALL to 1.7
In the 1.7 development drive (if and when we can) we make an effort to act on
patched issues in an attempt to pick the low hanging fruit so to speak...
Release early, release often :)
I'd say I'd be happy to try and spin it, but you'd beat me to it so I just
will say I'll be happy to test the RC and voice my VOTE when you roll
it Lewis :)
Happy Thanksgiving (even though you're not in the States yet)!
Cheers,
Chris
On Nov 22, 2012, at 7:15
Great job everyone!
Cheers,
Chris
On Oct 5, 2012, at 9:29 AM, Julien Nioche wrote:
Thanks Lewis and well done everyone!
Enjoy your week end
Julien
On 5 October 2012 16:12, lewis john mcgibbney lewi...@apache.org wrote:
Good Afternoon Everyone,
The Apache Nutch PMC are very pleased
Thanks for your VOTE!
Cheers,
Chris
On Oct 4, 2012, at 1:08 AM, j.sulli...@thomsonreuters.com
j.sulli...@thomsonreuters.com wrote:
A bit late but my two cents. I have done a couple of installs on Ubuntu 12.04
using MySQL for the backend and have noticed a couple of the improvements and
no
Take care dude! I'll give trunk a shot...
Cheers,
Chris
On Sep 21, 2012, at 7:34 AM, Lewis John Mcgibbney wrote:
Hi All,
Basically thank god it was brought to our attention that
gora-cassandra 0.2.1 is buggy and needs some work before it is ready
to be integrated into a stable Nutch 2.x
Lewis you beat me to it, you ROCK!
Cheers,
Chris
On Sep 18, 2012, at 5:11 PM, lewi...@apache.org
lewi...@apache.org wrote:
Author: lewismc
Date: Tue Sep 18 21:11:06 2012
New Revision: 1387363
URL: http://svn.apache.org/viewvc?rev=1387363&view=rev
Log:
forward port of NUTCH-1415
+1 I'd be happy to help!
Cheers,
Chris
On Sep 15, 2012, at 9:24 AM, Lewis John Mcgibbney wrote:
Hi Everyone,
Without me slevering on, this suggestion speaks for itself.
We have resolved 32 issues, including pulling in upgrades on the Gora
dependency. It would be nice to push these
. If you can do RM role it would be great.
Best
Lewis
On Sat, Sep 15, 2012 at 6:07 PM, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
+1 I'd be happy to help!
Cheers,
Chris
On Sep 15, 2012, at 9:24 AM, Lewis John Mcgibbney wrote:
Hi Everyone,
Without me slevering
Great to hear, Julien, nice!
Cheers,
Chris
On Sep 13, 2012, at 3:39 AM, Julien Nioche wrote:
Hi,
I'd just like to mention that I will be giving a talk about Nutch at the
Apache Conference Europe (Sinsheim, Germany 5–8 November 2012). The Apache
Conference should be a good opportunity
/viewer.php#/detail/254365383887354210_4414285
http://statigr.am/viewer.php#/detail/254365383887354210_4414285
Cheers,
Jérôme
On Fri, Aug 10, 2012 at 1:44 AM, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov
FYI...
Begin forwarded message:
From: Nick Burch nick.bu...@alfresco.com
Date: July 19, 2012 1:14:57 PM CDT
To: committ...@apache.org
Subject: Call for Papers for ApacheCon Europe 2012 now open!
Reply-To: apachecon-disc...@apache.org
Hi All
We're pleased to announce that the Call for
Hi Markus,
Great question. I am CC'ing Ruth Duerr and Ian Truslove at NSIDC
-- maybe they
can provide more information?
Ruth, Ian, please consider subscribing to dev@nutch.apache.org and/or
u...@nutch.apache.org
by sending blank emails to:
dev-subscr...@nutch.apache.org
Congrats, all!
Cheers,
Chris
On Jul 10, 2012, at 8:03 AM, Julien Nioche wrote:
Great Job Lewis! Thanks a lot
On 10 July 2012 15:40, lewis john mcgibbney lewi...@apache.org wrote:
Good Afternoon Everyone,
The Apache Nutch PMC are very pleased to announce the release of
Apache Nutch
+1 from me.
Cheers,
Chris
On Jul 9, 2012, at 3:37 AM, Julien Nioche wrote:
Guys,
Now that we've released 2.0, wouldn't it be better to rename the 'nutchgora'
branch into something like 'branch-2.x'? Any thoughts on this?
Julien
--
Open Source Solutions for Text Engineering
Hi Lewis,
+1 from me!
SIGS check out:
[chipotle:~/tmp/nutch-1.5.1] mattmann% $HOME/bin/verify_md5_checksums
md5sum: stat '*.bz2': No such file or directory
apache-nutch-1.5.1-bin.tar.gz: OK
apache-nutch-1.5.1-src.tar.gz: OK
apache-nutch-1.5.1-bin.zip: OK
apache-nutch-1.5.1-src.zip: OK
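For anyone repeating this check without the verify_md5_checksums helper (a local script whose contents are not shown in this thread), a minimal Python sketch of the same idea follows. It assumes each archive sits next to a sidecar file named after the archive with .md5 appended, and that the sidecar contains the hex digest as one of its whitespace-separated tokens. The function names and the return values are illustrative, not part of any Nutch or Apache tooling.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5_checksums(directory, patterns=("*.tar.gz", "*.zip")):
    """Compare each matching archive with its `<name>.md5` sidecar file.

    Assumption: the sidecar holds the hex digest as one of its
    whitespace-separated tokens.
    Returns {archive name: "OK" | "FAILED" | "MISSING"}.
    """
    results = {}
    for pattern in patterns:
        for archive in sorted(Path(directory).glob(pattern)):
            sidecar = archive.with_name(archive.name + ".md5")
            if not sidecar.exists():
                results[archive.name] = "MISSING"
                continue
            tokens = sidecar.read_text().lower().split()
            results[archive.name] = "OK" if md5_of(archive) in tokens else "FAILED"
    return results
```

Calling verify_md5_checksums on the directory holding a release candidate gives one status per archive. Checksums only detect corruption; the .asc signatures still need to be checked separately against the KEYS file.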
Thanks for your hard work here, Lewis!
Cheers,
Chris
On Jul 7, 2012, at 3:44 PM, Lewis John Mcgibbney wrote:
Hi Julien,
Believe it or not I've just spent around 45 mins waiting on committing
the site... broadband in Paris is nothing short of utterly abysmal to
say the very best. Please
, at 11:24 AM, Mattmann, Chris A (388J) wrote:
Hey Lewis,
I was running ant test -- sorry -- will try ant runtime now (any idea
what's up with test?)
Cheers,
Chris
On Jul 3, 2012, at 11:11 AM, Lewis John Mcgibbney wrote:
What commands are you using?
I just grabbed the src-tar.gz
does not exist!
Build failed
Lewis
On Wed, Jul 4, 2012 at 7:18 AM, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
Hi Lewis,
Odd, I don't get that.
I'll try futzing around again with it tomorrow -- what system are you on?
What is
your Ant version and Java version
Hey Julien,
On Jul 3, 2012, at 7:49 AM, Julien Nioche wrote:
[..snip..]
OK, so basically signatures and checksums are fine
+1, yep they are great.
Tried to build and test and got this:
[ivy:resolve] ::
[..snip...]
Try
Hey Julien,
I ran this command: rm -rf /Users/mattmann/.ivy2/
But it still failed with the below messages:
[ivy:resolve] :: problems summary ::
[ivy:resolve] WARNINGS
[ivy:resolve] [FAILED ]
org.apache.hadoop#hadoop-core;1.0.3!hadoop-core.jar: invalid sha1:
I'll try to scope this by tomorrow...thanks Lewis.
Cheers,
Chris
On Jul 2, 2012, at 10:49 AM, Lewis John Mcgibbney wrote:
Anyone else for this RC?
I've been slightly distracted with a number of things recently and only
just getting round to following this one up so apologies about that.
Hey Guys,
(sorry for the top post)
There's no reason to freeze trunk during releases. In fact, during the RC, once
the branch (or tag for that matter)
is created, trunk can continue on, no need to stop. Heck, we can always just
tag or branch from a specific
revision too so it's not really a
+1!
Cheers,
Chris
On Jun 19, 2012, at 2:26 AM, Julien Nioche wrote:
Quite annoying that we did not spot this before releasing. What about a 1.5.1
soonish with this fix + couple smallish improvements e.g. upgrade to Hadoop
1.0.3?
J.
-- Forwarded message --
From:
with only releasing src.
On Thu, Jun 14, 2012 at 11:32 PM, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
Or just not ship a bin release at all. Src is the only thing we really VOTE
on legally though bin is provided for convenience purposes. Will type more on
this later
Hey Guys,
I think the annoyance is probably something folks can live with as they have
been
waiting for an official release of 2.x for years :)
My +1 to roll RC #2 with or without a solution to this and mark it as a TODO.
Release early, release often :)
Cheers,
Chris
On Jun 14, 2012, at 10:04
On 14 June 2012 20:27, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
Hey Guys,
I think the annoyance is probably something folks can live with as they have
been
waiting for an official release of 2.x for years :)
My +1 to roll RC #2
+1 to the description w/o experimental too (I agree with Ferdy).
You guys ROCK.
Cheers,
Chris
On Jun 13, 2012, at 5:29 AM, Lewis John Mcgibbney wrote:
Hi,
Seeing as we have the ball rolling with the 2.0 RC. I thought I'd ask
about a suitable project descriptor.
So far on trunk we have
Hey Lewis,
I will get to this tonight, for sure.
Thanks!
Cheers,
Chris
On Jun 12, 2012, at 1:16 PM, Lewis John Mcgibbney wrote:
Hi Everyone,
I appreciate that most of the core devs are using trunk; however, I
would appeal to you guys to at least check out the artifacts and check
sigs,
Hey Guys,
#2 is probably reason enough for a respin.
Lewis if you don't have time to do it before Thursday, I could probably
give it a whack. Let me know.
Cheers,
Chris
On Jun 12, 2012, at 3:33 PM, Sebastian Nagel wrote:
Hi Lewis,
my first steps with 2.0 (to be continued, still
Hey Lewis,
+1 from me!
SIGS check out:
[chipotle:nutch-dev/1.5-release/rc4] mattmann% ls
apache-nutch-1.5-bin.tar.gz apache-nutch-1.5-bin.zip
apache-nutch-1.5-src.tar.gz apache-nutch-1.5-src.zip
apache-nutch-1.5-bin.tar.gz.asc apache-nutch-1.5-bin.zip.asc
Hey Guys,
Does this warrant a respin, or are you +1 Juls?
Cheers,
Chris
On May 31, 2012, at 1:44 AM, Julien Nioche wrote:
Hi Lewis,
Minor nitpick : the directory /runtime is not necessary as it is built with
ANT. Removing it would massively reduce the size of the archive. Could we fix
2012 15:24, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
Hey Guys,
Does this warrant a respin, or are you +1 Juls?
Cheers,
Chris
On May 31, 2012, at 1:44 AM, Julien Nioche wrote:
Hi Lewis,
Minor nitpick : the directory /runtime is not necessary as it is built
this comply with release policy?
Thanks
Lewis
On Thu, May 31, 2012 at 3:49 PM, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
okey dokey.
I will try and take the time to review the RC today. Thanks for pushing
this Lewis!
Cheers,
Chris
On May 31, 2012, at 7:36 AM, Julien
+1 happy for Lewis to try I've been swamped!
Sent from my iPhone
On May 22, 2012, at 2:16 AM, Julien Nioche
lists.digitalpeb...@gmail.com wrote:
Hi Lewis,
I am sure that Chris will have no problem with you doing the RC2. Chris? It
would be a good thing to
+1
Sent from my iPhone
On May 22, 2012, at 4:43 AM, Lewis John Mcgibbney
lewis.mcgibb...@gmail.com wrote:
Hi,
As I say, I am able to stick time in tonight to roll this RC, however does
anyone have a problem with me rolling the 2.0 RC tonight after the 1.5RC2?
Hey Julien,
On May 9, 2012, at 3:11 AM, Julien Nioche wrote:
Hi Chris
Any chance you could do a RC2 for the trunk soonish? We've been a bit stuck
since mid April and it would be nice to move on. If not I can try and spin a
RC myself but it is likely to be hilarious :-)
Haha, no worries.
, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
Hi Guys,
Hey Lewis,
On Apr 17, 2012, at 3:35 AM, Lewis John Mcgibbney wrote:
3) We previously discussed implementing the Any23 parser plugin as a tika
wrapper, therefore it would look very similar to parse-tika?
I think it would be super awesome to add the Any23 parsing functionality as a
Tika
Hi Julien,
On Apr 16, 2012, at 2:02 AM, Julien Nioche wrote:
Thanks Chris,
-1 the versions of the deps for hadoop, tika and possibly others are not
correct in the pom.xml found in the src archive and on the mvn repository,
which will be a problem for whoever tries to use the pom.xml
Hey Sami,
Thanks. I'll fix the 4 license headers you mention below as part of RC #2.
Cheers,
Chris
On Apr 16, 2012, at 3:02 AM, Sami Siren wrote:
On Mon, Apr 16, 2012 at 8:43 AM, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
Hi Folks,
A candidate for the Nutch 1.5 release
file for RC #2 as you mention below
and not sure why the extension was .tar.gz.tar.gz, I'll fix that too.
Cheers,
Chris
On Apr 16, 2012, at 3:12 AM, Lewis John Mcgibbney wrote:
Hi Chris,
On Mon, Apr 16, 2012 at 6:43 AM, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
Hi
Hi Folks,
A candidate for the Nutch 1.5 release is available at:
http://people.apache.org/~mattmann/apache-nutch-1.5/rc1/
The release candidate is a zip and tar.gz archive of the sources in:
http://svn.apache.org/repos/asf/nutch/tags/release-1.5/
And a binary build suitable for
Julien
On 3 April 2012 15:30, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
Thanks Lewis!
Cheers,
Chris
P.S. Hopefully by this weekend...
On Apr 3, 2012, at 7:23 AM, Lewis John Mcgibbney wrote:
Hi,
On Tue, Apr 3, 2012 at 3:12 PM, Markus Jelsma markus.jel
Hi Markus,
On Apr 3, 2012, at 5:50 AM, Markus Jelsma wrote:
Cool!
Next time I'll ask infra to allow us to suppress notifications.
Chris, will you RM one RC? And if possible list the detailed steps/commands in
the process in case you don't have the time to RM 1.6 when the time comes. The
wiki
Thanks Lewis!
Cheers,
Chris
P.S. Hopefully by this weekend...
On Apr 3, 2012, at 7:23 AM, Lewis John Mcgibbney wrote:
Hi,
On Tue, Apr 3, 2012 at 3:12 PM, Markus Jelsma markus.jel...@openindex.io
wrote:
Seems fine. Only updating KEYS is no longer necessary.
Now sorted.
Thanks
Hey Guys,
I've got some cycles this weekend -- anyone up for a 1.5 release off trunk
(stable), and
a NutchGora branch release? I suggested this before [1] regarding NutchGora.
I'm inclined to say let's do the following:
1. NutchGora: apache-nutch-2.0 - release 2.x series based on this branch
2.
, 2012 at 3:03 PM, Markus Jelsma markus.jel...@openindex.io
wrote:
+1
1.5 has, again, many fixes and improvements, just as 1.4 had over 1.3. But i'd
like to integrate Tika 1.1 after its pending release.
Cheers
On Thursday 08 March 2012 15:38:15 Mattmann, Chris A (388J) wrote:
Hey Guys
Guys, FYI...in case anyone is thinking of GSoC, deadlines are approaching.
Process
is described below...
Thanks!
Cheers,
Chris
Begin forwarded message:
From: Ulrich Stärk u...@apache.org
Date: March 4, 2012 9:01:07 AM PST
To: p...@apache.org p...@apache.org
Cc: d...@community.apache.org
FYI...awesome!
Begin forwarded message:
From: Jason Trost jason.tr...@gmail.com
Date: February 28, 2012 5:41:23 PM PST
To: common-u...@hadoop.apache.org common-u...@hadoop.apache.org
Subject: [blog post] Accumulo, Nutch, and Gora
Reply-To: common-u...@hadoop.apache.org
+1 guys. Just let me know when you are ready and I can RM it.
Cheers,
Chris
On Feb 20, 2012, at 8:01 AM, Lewis John Mcgibbney wrote:
Hi,
Not ignoring Chris' comments, but addressing the points below first, please
see comments.
On Mon, Feb 20, 2012 at 2:57 PM, Ferdy Galema
Hey Lewis,
I'd be +1 to roll a Nutchgora 2.0 release.
I could see dealing with this in two ways, neither of which I like better than
the other:
1. Release the nutchgora branch as apache-nutch-2.0, and then nutchgora
becomes
the 2.0 branch of the system (and we could create branch-2.0) The 1.x
Any Nutch Devs interested in a GSoC student?
Begin forwarded message:
From: Luciano Resende luckbr1...@gmail.com
Date: February 4, 2012 10:40:03 AM PST
To: d...@community.apache.org d...@community.apache.org, code-awards
code-awa...@apache.org
Subject: Fwd: [Announce] Google Summer of Code
FYI
Begin forwarded message:
From: Ross Gardler rgard...@opendirective.com
Date: February 5, 2012 1:45:18 PM PST
To: d...@community.apache.org d...@community.apache.org
Subject: RE: [Announce] Google Summer of Code 2012
Reply-To: d...@community.apache.org d...@community.apache.org
For
, we also explicitly filter out all/most unwanted suffixes.
We do have a lot of suffixes that we encountered so far.
On Saturday 28 January 2012 03:01:26 Mattmann, Chris A (388J) wrote:
(sorry for the cross post)
Hey Guys,
I'm trying to find a good citation or estimate (if anyone has
Hi Ken,
On Jan 21, 2012, at 10:33 AM, Ken Krugler wrote:
My own personal favorite area would be to integrate with crawler-commons.
+1. Would you crawler-commons guys be interested in bringing that code to
Apache?
How about bringing it over to Nutch?
Would that be something you'd be
Yay, all I heard was that it's building again woo hoo!
On Jan 6, 2012, at 9:03 AM, Markus Jelsma wrote:
Ah, i get 88 warnings now but things build fine. This is indeed quite more
verbose :)
On Tuesday 27 December 2011 17:28:31 Lewis John McGibbney (Commented) (JIRA)
wrote:
[
Merry Christmas buddy!
Cheers,
Chris
On Dec 25, 2011, at 9:14 AM, Lewis John Mcgibbney wrote:
Hi Guys,
Our trunk builds have been broken since migrating to new Hadoop 0.20.2
and migrating CrawlDBScanner to new MR API e.g. trunk build [1] 1698.
Looking to the stack trace, I'm assuming that
+1 from me -- those 3 Tika content handlers should take care of it...
Cheers,
Chris
On Dec 21, 2011, at 6:51 AM, Markus Jelsma wrote:
Hi,
For using Boilerpipe we need LinkCH, BoilerpipeCH and TeeCH in Tika. LinkCH
returns all URL's with some meta data such as title etc. Fixes for old
Hi Lewis,
+1 from me to the update and to logging a JIRA issue. Always nice to see
an associated changelog entry for any (even non trivial) updates, short of
typos and error corrections in docs/etc. Up to you though, since you're the one
doing the work :-)
Cheers,
Chris
On Dec 12, 2011, at
Mattmann, Chris A (388J) chris.a.mattm...@jpl.nasa.gov schreef:
OK, of course, I figured it out, and updated my program :-)
You can see it on Github below. I'm going to clean up and
generalize this program because I think it's of general use.
I'll create an issue shortly.
I'm thinking
Hey Guys,
So, I've completed my crawl of the vault.fbi.gov website for my class that I'm
preparing
for. I've got:
[chipotle:local/nutch/framework] mattmann% du -hs crawl
28G     crawl
[chipotle:local/nutch/framework] mattmann%
[chipotle:local/nutch/framework] mattmann% ls -l crawl/segments/
have to M/R this.
Just wanted to let you guys know where I'm at, and what
I've been trying.
Thanks,
Chris
On Nov 28, 2011, at 7:23 PM, Mattmann, Chris A (388J) wrote:
Hey Guys,
So, I've completed my crawl of the vault.fbi.gov website for my class that
I'm preparing
for. I've got
files that were downloaded by Nutch.
Do you guys see this as a useful tool? If so, I'll contribute it this week for
1.5.
Cheers,
Chris
On Nov 28, 2011, at 7:32 PM, Mattmann, Chris A (388J) wrote:
Hey Guys,
One more thing. Just to let you know I've followed this blog here:
http
Hi Everyone,
This VOTE has passed:
+1 PMC
Julien Nioche
Markus Jelsma
Lewis John McGibbney
Chris Mattmann
I'll go ahead and update the website and push the release out to the mirrors.
Thanks
for VOTE'ing and for your patience!
Cheers,
Chris
(...apologies for the cross posting...)
The Apache Nutch project is pleased to announce the release of Apache Nutch
1.4. The release contents have been pushed out to the main Apache release
site so the releases should be available as soon as the mirrors get the
syncs.
Apache Nutch is an
Mattmann, Chris A (388J) wrote:
Hi Markus,
On Nov 24, 2011, at 12:03 PM, Markus Jelsma wrote:
So, what's the point of that initial if(...) block outside of the for
loop. Isn't it redundant?
This is trunk? I've been and still am working on some issues for a new
feature in this part
...after I get back from Thanksgiving dinner :-)
1. In URLFilterChecker, the cmd line tool requires URLs to be fed into it on
STDIN, but
that isn't documented anywhere, even in the tool help printed to STDOUT. I'll
fix that.
2. In ParseOutputFormat, I see a code block:
{code}
//
Hey PJ,
On Nov 22, 2011, at 10:47 AM, PJ Herring wrote:
Hey Chris,
Thanks for the response. I looked at the documents you sent me, and I really
do think incorporating some kind of DI Framework could be a great addition to
Nutch.
I have a general plan of attack, but I'll try to write
Hey PJ,
You aren't being an ass at all. You're asking an important question, and
something I've been interested in for a while.
Here are some relevant threads to take a look at:
http://wiki.apache.org/nutch/Nutch2Architecture
+1 from me, Lewis, great work.
Cheers,
Chris
On Nov 19, 2011, at 4:11 AM, Lewis John Mcgibbney wrote:
Hi,
Please see here [1], and associated issue logged in Nutch JIRA [2]. As I
explain in the issue, although Gora trunk is not stable there is ongoing
work to fix this.
Thanks for now
Awesome news, great to hear!
Cheers,
Chris
On Nov 17, 2011, at 8:57 AM, Lewis John Mcgibbney wrote:
Hi,
Some more positives here.
Lewis
-- Forwarded message --
From: Pietro Borradori pietro.borrad...@similarpages.com
Date: Thu, Nov 17, 2011 at 4:46 PM
Subject: Fw:
Hadoop _SUCCESS
file (markus)
Thanks
Julien
On 9 November 2011 10:21, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
Hi Julien,
Thanks. OK, so I will respin an RC for 1.4 that
fixes the naming screw up. I already created the KEYS file
so we're fine
+1 to the GUI comment, even though I haven't made one yet, it's definitely on
my list of items should I find the cycles to do more besides releasing.
Thanks!
Cheers,
Chris
On Nov 15, 2011, at 1:01 PM, Markus Jelsma wrote:
Hi Guys,
During ApacheCon I made a point of trying to gauge how
WOOT!
Lewis and I talked about updating this at ApacheCon NA and I sent him the OODT
release guide and he's
done a masterful job updating ours.
Thanks Lewis you rock man.
Cheers,
Chris
On Nov 15, 2011, at 1:56 PM, Lewis John Mcgibbney wrote:
Hi guys,
Please see here [1] for my attempt