Re: need suggestion for GSoC 2016
Hi Lewis,

My Nutch wiki user name is "AmmarShadiq". My interest in Nutch so far is precise crawling.

\Ammar

On Sat, Jan 23, 2016 at 1:49 AM, Lewis John Mcgibbney <lewis.mcgibb...@gmail.com> wrote:

> Hi Ammar,
> CC dev@
> Apologies, I must have missed the post!
> Well... I've created a new entry on the wiki for you to register your
> interest. Please provide me with your wiki username and I'll grant you
> write access to the wiki.
> It would be great if we could hash out here what you are interested in and
> what would make a good project.
> Let's do a bit of brainstorming here and see where we get.
> Lewis
>
> On Fri, Jan 22, 2016 at 2:59 PM, Ammar Shadiq wrote:
>
>> Hi Lewis,
>>
>> I wrote to the dev list several months ago (
>> http://www.mail-archive.com/dev%40nutch.apache.org/msg19783.html) and
>> haven't had any reply so far.
>> I would appreciate your suggestions.
>>
>> Warmest regards
>> Ammar Shadiq
>>
>> On Tue, Nov 3, 2015 at 3:28 AM, Lewis John Mcgibbney <
>> lewis.mcgibb...@gmail.com> wrote:
>>
>>> Hi Ammar,
>>> I have a few suggestions but in all honesty I would write to the Nutch
>>> dev@ list and ask there.
>>> The PMC have not really started thinking about GSoC yet so your
>>> conversation would be really good.
>>> Let's take it from there.
>>>
>>> On Sunday, November 1, 2015, Ammar Shadiq wrote:
>>>
>>>> Hi Lewis,
>>>>
>>>> Several years ago I submitted a GSoC proposal for the development of a
>>>> Nutch screen scraper plugin
>>>> (https://issues.apache.org/jira/browse/NUTCH-978) and couldn't
>>>> participate again in GSoC 2012 because I was no longer a student. I am
>>>> now pursuing my master's degree and am eligible to participate in next
>>>> year's GSoC. I'm interested in contributing to Apache Nutch for GSoC
>>>> 2016 and need suggestions on which projects/features are available.
>>>> Could you give any advice?
>>>>
>>>> --
>>>> Thank you,
>>>> Ammar Shadiq
>>>> http://ammarshadiq.web.id/
>>>
>>> --
>>> *Lewis*
>>
>> --
>> Thank you,
>> Ammar Shadiq
>> http://ammarshadiq.web.id/
>
> --
> *Lewis*

--
Thank you,
Ammar Shadiq
http://ammarshadiq.web.id/
[jira] [Commented] (NUTCH-1741) Support of Sitemaps in Nutch 2.x
[ https://issues.apache.org/jira/browse/NUTCH-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15113423#comment-15113423 ]

Lewis John McGibbney commented on NUTCH-1741:
---------------------------------------------

I'm nearly finished updating the v6 patch for 2.X and will commit once this is done. This was not suitable for inclusion in 2.3.1 as it was not a bug fix. It is now good for inclusion in 2.4.

> Support of Sitemaps in Nutch 2.x
>
>          Key: NUTCH-1741
>          URL: https://issues.apache.org/jira/browse/NUTCH-1741
>      Project: Nutch
>   Issue Type: New Feature
>   Components: fetcher, generator
>     Reporter: Alparslan Avcı
>     Assignee: cihad güzel
>       Labels: gsoc2015
>      Fix For: 2.4
>
>  Attachments: NUTCH-1741-v2.patch, NUTCH-1741-v3.patch,
>               NUTCH-1741-v4.patch, NUTCH-1741.patch, NUTCH-1741v5.patch,
>               NUTCH-1741v6.patch, SitemapCrawlerLifeCycle.pdf,
>               SitemapDevelopmentFor2x.pdf
>
> Sitemap support has to be implemented for 2.x branch. It is being discussed
> in NUTCH-1465 for trunk.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (NUTCH-1741) Support of Sitemaps in Nutch 2.x
[ https://issues.apache.org/jira/browse/NUTCH-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lewis John McGibbney updated NUTCH-1741:
---------------------------------------
    Assignee: cihad güzel

> Support of Sitemaps in Nutch 2.x
>
>          Key: NUTCH-1741
>          URL: https://issues.apache.org/jira/browse/NUTCH-1741
>      Project: Nutch
>   Issue Type: New Feature
>   Components: fetcher, generator
>     Reporter: Alparslan Avcı
>     Assignee: cihad güzel
>       Labels: gsoc2015
>      Fix For: 2.4
>
>  Attachments: NUTCH-1741-v2.patch, NUTCH-1741-v3.patch,
>               NUTCH-1741-v4.patch, NUTCH-1741.patch, NUTCH-1741v5.patch,
>               NUTCH-1741v6.patch, SitemapCrawlerLifeCycle.pdf,
>               SitemapDevelopmentFor2x.pdf
>
> Sitemap support has to be implemented for 2.x branch. It is being discussed
> in NUTCH-1465 for trunk.
Re: need suggestion for GSoC 2016
Hi Ammar,
CC dev@
Apologies, I must have missed the post!
Well... I've created a new entry on the wiki for you to register your interest. Please provide me with your wiki username and I'll grant you write access to the wiki.
It would be great if we could hash out here what you are interested in and what would make a good project.
Let's do a bit of brainstorming here and see where we get.
Lewis

On Fri, Jan 22, 2016 at 2:59 PM, Ammar Shadiq wrote:

> Hi Lewis,
>
> I wrote to the dev list several months ago (
> http://www.mail-archive.com/dev%40nutch.apache.org/msg19783.html) and
> haven't had any reply so far.
> I would appreciate your suggestions.
>
> Warmest regards
> Ammar Shadiq
>
> On Tue, Nov 3, 2015 at 3:28 AM, Lewis John Mcgibbney <
> lewis.mcgibb...@gmail.com> wrote:
>
>> Hi Ammar,
>> I have a few suggestions but in all honesty I would write to the Nutch
>> dev@ list and ask there.
>> The PMC have not really started thinking about GSoC yet so your
>> conversation would be really good.
>> Let's take it from there.
>>
>> On Sunday, November 1, 2015, Ammar Shadiq wrote:
>>
>>> Hi Lewis,
>>>
>>> Several years ago I submitted a GSoC proposal for the development of a
>>> Nutch screen scraper plugin
>>> (https://issues.apache.org/jira/browse/NUTCH-978) and couldn't
>>> participate again in GSoC 2012 because I was no longer a student. I am
>>> now pursuing my master's degree and am eligible to participate in next
>>> year's GSoC. I'm interested in contributing to Apache Nutch for GSoC
>>> 2016 and need suggestions on which projects/features are available.
>>> Could you give any advice?
>>>
>>> --
>>> Thank you,
>>> Ammar Shadiq
>>> http://ammarshadiq.web.id/
>>
>> --
>> *Lewis*
>
> --
> Thank you,
> Ammar Shadiq
> http://ammarshadiq.web.id/

--
*Lewis*
[Nutch Wiki] Trivial Update of "GoogleSummerOfCode" by LewisJohnMcgibbney
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The "GoogleSummerOfCode" page has been changed by LewisJohnMcgibbney:
https://wiki.apache.org/nutch/GoogleSummerOfCode?action=diff&rev1=14&rev2=15

  === Ideas ===
  You can see GSoC ideas from [[https://wiki.apache.org/nutch/GoogleSummerOfCode/Ideas|this page.]]

  == Projects ==
+
+ === 2016 ===
+ List of accepted projects for GSoC 2016 are listed below. Both students and mentors are encouraged to sign up as well as [[http://nutch.apache.org/mailing_lists.html|discuss ideas on the community mailing lists]].
+
+ ||'''Student'''||'''Project'''||'''Mentor(s)'''||
+
+
+
- {{attachment:gsoc2015.png}}
  === 2015 ===
[jira] [Commented] (NUTCH-2171) Upgrade Nutch Trunk to Java 1.8
[ https://issues.apache.org/jira/browse/NUTCH-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15113380#comment-15113380 ]

Lewis John McGibbney commented on NUTCH-2171:
---------------------------------------------

Hey [~jorgelbg], feel free to assign this to yourself. It would be a reasonably large patch touching a number of files, but it would be a really valuable contribution.

> Upgrade Nutch Trunk to Java 1.8
> -------------------------------
>
>          Key: NUTCH-2171
>          URL: https://issues.apache.org/jira/browse/NUTCH-2171
>      Project: Nutch
>   Issue Type: Task
>     Reporter: Lewis John McGibbney
>
> Lambda expressions are fantastic. I tried to undertake a small exercise which
> would indicate how many we could implement however this was a fruitless
> effort. A patch is going to be a better approach. This task involves
> upgrading various properties in default.properties as well as a systemic
> source code analysis with the aim of implementing Java 8 goodies throughout.
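For readers following NUTCH-2171, here is a minimal sketch of the kind of rewrite the issue description seems to have in mind: an anonymous inner class replaced by a Java 8 lambda-style expression. This is only an illustration. The class `LambdaSketch` and both method names are hypothetical and do not come from the Nutch code base, and the URL strings are made-up sample data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical example class; nothing here is taken from Nutch itself.
class LambdaSketch {

  // Pre-Java 8 style: the comparator is an anonymous inner class.
  static List<String> sortByLengthOld(List<String> urls) {
    List<String> copy = new ArrayList<>(urls);
    Collections.sort(copy, new Comparator<String>() {
      @Override
      public int compare(String a, String b) {
        return Integer.compare(a.length(), b.length());
      }
    });
    return copy;
  }

  // Java 8 style: the same ordering expressed with a method reference.
  static List<String> sortByLengthNew(List<String> urls) {
    List<String> copy = new ArrayList<>(urls);
    copy.sort(Comparator.comparingInt(String::length));
    return copy;
  }

  public static void main(String[] args) {
    // Made-up sample input, used only to show that both versions agree.
    List<String> urls = Arrays.asList(
        "http://example.org/a/b", "http://a.io", "http://nutch.apache.org");
    System.out.println(sortByLengthOld(urls));
    System.out.println(sortByLengthNew(urls));
  }
}
```

Both methods return a sorted copy and leave the input untouched, so a patch of this kind can be checked by comparing old and new results on the same data.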
[jira] [Commented] (NUTCH-2204) Remove junit lib from runtime
[ https://issues.apache.org/jira/browse/NUTCH-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15113206#comment-15113206 ]

Hudson commented on NUTCH-2204:
-------------------------------

SUCCESS: Integrated in Nutch-trunk #3341 (See [https://builds.apache.org/job/Nutch-trunk/3341/])
NUTCH-2204 : revert erroneous commit (snagel: [http://svn.apache.org/viewvc/nutch/trunk/?view=rev&rev=1726318])
* trunk/conf/regex-normalize.xml.template
NUTCH-2204 Remove junit lib from runtime (snagel: [http://svn.apache.org/viewvc/nutch/trunk/?view=rev&rev=1726314])
* trunk/CHANGES.txt
* trunk/conf/regex-normalize.xml.template
* trunk/ivy/ivy.xml

> Remove junit lib from runtime
> -----------------------------
>
>               Key: NUTCH-2204
>               URL: https://issues.apache.org/jira/browse/NUTCH-2204
>           Project: Nutch
>        Issue Type: Improvement
>        Components: build
> Affects Versions: 1.11
>          Reporter: Sebastian Nagel
>          Priority: Trivial
>           Fix For: 1.12
>
>       Attachments: NUTCH-2204.patch
>
> The junit library is shipped in the Nutch bin package as an unnecessary
> dependency (apache-nutch-1.11/lib/junit-3.8.1.jar). Unit tests use a
> different library version:
> {noformat}
> % ls build/lib/junit* build/test/lib/junit*
> build/lib/junit-3.8.1.jar build/test/lib/junit-4.11.jar
> {noformat}
[jira] [Commented] (NUTCH-2171) Upgrade Nutch Trunk to Java 1.8
[ https://issues.apache.org/jira/browse/NUTCH-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15113192#comment-15113192 ]

Jorge Luis Betancourt Gonzalez commented on NUTCH-2171:
-------------------------------------------------------

Perhaps an approach using checkstyle, combined with this recipe http://www.puppycrawl.com/blog/2015/09/03/checkstyle-force-lambdas.html, could help us move forward. This could address at least the code analysis part.

> Upgrade Nutch Trunk to Java 1.8
> -------------------------------
>
>          Key: NUTCH-2171
>          URL: https://issues.apache.org/jira/browse/NUTCH-2171
>      Project: Nutch
>   Issue Type: Task
>     Reporter: Lewis John McGibbney
>
> Lambda expressions are fantastic. I tried to undertake a small exercise which
> would indicate how many we could implement however this was a fruitless
> effort. A patch is going to be a better approach. This task involves
> upgrading various properties in default.properties as well as a systemic
> source code analysis with the aim of implementing Java 8 goodies throughout.
[jira] [Resolved] (NUTCH-2204) remove junit lib from runtime
[ https://issues.apache.org/jira/browse/NUTCH-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastian Nagel resolved NUTCH-2204.
-----------------------------------
    Resolution: Fixed

Committed to trunk, r1726318.

> remove junit lib from runtime
> -----------------------------
>
>               Key: NUTCH-2204
>               URL: https://issues.apache.org/jira/browse/NUTCH-2204
>           Project: Nutch
>        Issue Type: Improvement
>        Components: build
> Affects Versions: 1.11
>          Reporter: Sebastian Nagel
>          Priority: Trivial
>           Fix For: 1.12
>
>       Attachments: NUTCH-2204.patch
>
> The junit library is shipped in the Nutch bin package as an unnecessary
> dependency (apache-nutch-1.11/lib/junit-3.8.1.jar). Unit tests use a
> different library version:
> {noformat}
> % ls build/lib/junit* build/test/lib/junit*
> build/lib/junit-3.8.1.jar build/test/lib/junit-4.11.jar
> {noformat}
[jira] [Updated] (NUTCH-2204) Remove junit lib from runtime
[ https://issues.apache.org/jira/browse/NUTCH-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastian Nagel updated NUTCH-2204:
----------------------------------
    Summary: Remove junit lib from runtime  (was: remove junit lib from runtime)

> Remove junit lib from runtime
> -----------------------------
>
>               Key: NUTCH-2204
>               URL: https://issues.apache.org/jira/browse/NUTCH-2204
>           Project: Nutch
>        Issue Type: Improvement
>        Components: build
> Affects Versions: 1.11
>          Reporter: Sebastian Nagel
>          Priority: Trivial
>           Fix For: 1.12
>
>       Attachments: NUTCH-2204.patch
>
> The junit library is shipped in the Nutch bin package as an unnecessary
> dependency (apache-nutch-1.11/lib/junit-3.8.1.jar). Unit tests use a
> different library version:
> {noformat}
> % ls build/lib/junit* build/test/lib/junit*
> build/lib/junit-3.8.1.jar build/test/lib/junit-4.11.jar
> {noformat}
[jira] [Commented] (NUTCH-2204) remove junit lib from runtime
[ https://issues.apache.org/jira/browse/NUTCH-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15113021#comment-15113021 ]

Julien Nioche commented on NUTCH-2204:
--------------------------------------

+1

> remove junit lib from runtime
> -----------------------------
>
>               Key: NUTCH-2204
>               URL: https://issues.apache.org/jira/browse/NUTCH-2204
>           Project: Nutch
>        Issue Type: Improvement
>        Components: build
> Affects Versions: 1.11
>          Reporter: Sebastian Nagel
>          Priority: Trivial
>           Fix For: 1.12
>
>       Attachments: NUTCH-2204.patch
>
> The junit library is shipped in the Nutch bin package as an unnecessary
> dependency (apache-nutch-1.11/lib/junit-3.8.1.jar). Unit tests use a
> different library version:
> {noformat}
> % ls build/lib/junit* build/test/lib/junit*
> build/lib/junit-3.8.1.jar build/test/lib/junit-4.11.jar
> {noformat}
[jira] [Updated] (NUTCH-2204) remove junit lib from runtime
[ https://issues.apache.org/jira/browse/NUTCH-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastian Nagel updated NUTCH-2204:
----------------------------------
    Attachment: NUTCH-2204.patch

> remove junit lib from runtime
> -----------------------------
>
>               Key: NUTCH-2204
>               URL: https://issues.apache.org/jira/browse/NUTCH-2204
>           Project: Nutch
>        Issue Type: Improvement
>        Components: build
> Affects Versions: 1.11
>          Reporter: Sebastian Nagel
>          Priority: Trivial
>           Fix For: 1.12
>
>       Attachments: NUTCH-2204.patch
>
> The junit library is shipped in the Nutch bin package as an unnecessary
> dependency (apache-nutch-1.11/lib/junit-3.8.1.jar). Unit tests use a
> different library version:
> {noformat}
> % ls build/lib/junit* build/test/lib/junit*
> build/lib/junit-3.8.1.jar build/test/lib/junit-4.11.jar
> {noformat}
[jira] [Created] (NUTCH-2204) remove junit lib from runtime
Sebastian Nagel created NUTCH-2204:
----------------------------------

             Summary: remove junit lib from runtime
                 Key: NUTCH-2204
                 URL: https://issues.apache.org/jira/browse/NUTCH-2204
             Project: Nutch
          Issue Type: Improvement
          Components: build
    Affects Versions: 1.11
            Reporter: Sebastian Nagel
            Priority: Trivial
             Fix For: 1.12

The junit library is shipped in the Nutch bin package as an unnecessary dependency (apache-nutch-1.11/lib/junit-3.8.1.jar). Unit tests use a different library version:
{noformat}
% ls build/lib/junit* build/test/lib/junit*
build/lib/junit-3.8.1.jar build/test/lib/junit-4.11.jar
{noformat}
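The `ls` check in the issue could also be done programmatically, for example as a guard after a build. The following is a minimal sketch under stated assumptions, not part of the NUTCH-2204 patch: the class `JunitJarCheck` and its methods are hypothetical, and the only detail taken from the issue is the `build/lib` directory name.

```java
import java.io.File;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper; not part of Nutch or of the attached patch.
class JunitJarCheck {

  // Keep only names that look like junit jars, e.g. "junit-3.8.1.jar".
  static List<String> junitJars(List<String> fileNames) {
    return fileNames.stream()
        .filter(name -> name.startsWith("junit") && name.endsWith(".jar"))
        .sorted()
        .collect(Collectors.toList());
  }

  // File names in a directory; empty if the path is not a readable directory.
  static List<String> listFileNames(File dir) {
    String[] names = dir.list(); // null when dir is missing or not a directory
    return names == null ? Collections.<String>emptyList() : Arrays.asList(names);
  }

  public static void main(String[] args) {
    // "build/lib" is the runtime library directory shown in the issue.
    List<String> found = junitJars(listFileNames(new File("build/lib")));
    if (found.isEmpty()) {
      System.out.println("no junit jar in build/lib");
    } else {
      System.out.println("junit still shipped in build/lib: " + found);
    }
  }
}
```

The name filter is kept separate from the directory listing so that it can be exercised without touching the file system.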