[ 
https://issues.apache.org/jira/browse/NUTCH-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144099#comment-13144099
 ] 

Chris A. Mattmann commented on NUTCH-1098:
------------------------------------------

Guys: let's change the tone of this issue, OK?

Radim, thanks for your patch. Sorry that it didn't get applied and that the 
feedback/discussion on it went the way it did. I would encourage you not to 
get discouraged, and I appreciate your effort in trying to contribute to the 
Apache Nutch project.

The committers are the ones that have to figure out how to maintain things, and 
sometimes we get hung up on (yes, I'll agree) less important issues. I'm going 
to recommend that everyone just table those for the moment and that we move 
forward here. 

Here are some concrete next steps:

1. Ferdy: is it possible to commit the portion of this patch that you do 
understand? Then we could leave the part that you don't understand uncommitted. 
This has 2 immediate goals:
  - gives Radim a good feeling about contributing to the project -- he deserves 
that.
  - gives us the ability to cherry-pick what we understand and are willing to 
maintain

2. Radim: if you want to help with the formatting and the other requested 
changes, great. If you don't, then that's fine too. At that point, though, the 
maintenance/evolution of the patch will transition more to the Nutch folks, 
and you might not be as involved with it unless you get on board with what the 
guys have decided are their code formatting and patch generation guidelines.

Thanks!




                
> better url-normalizer basic
> ---------------------------
>
>                 Key: NUTCH-1098
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1098
>             Project: Nutch
>          Issue Type: Improvement
>          Components: fetcher
>    Affects Versions: 1.3
>         Environment: Any
>            Reporter: Radim Kolar
>            Assignee: Markus Jelsma
>              Labels: encoding, url
>             Fix For: 1.5
>
>         Attachments: patch-with-utf8-encoding.diff
>
>   Original Estimate: 4h
>  Remaining Estimate: 4h
>
> Basic URL normalizer lacks 2 important features:
>   - Encode space in URL into %20 to unbreak httpclient and possibly others who
>     do not expect space inside URL
>   - Ability to decode %33 encoding in URL. This is important for avoiding
>     duplicates
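
For readers following the issue, the two requested behaviours can be sketched 
roughly as below. This is only an illustration in Python, not the attached 
patch and not Nutch's Java normalizer; the helper name normalize_escapes is 
hypothetical. It assumes that only escapes of RFC 3986 "unreserved" characters 
(letters, digits, "-", ".", "_", "~") are safe to decode, and it additionally 
uppercases the hex digits of the escapes it keeps, which the issue does not 
ask for.

```python
import re

# Unreserved characters per RFC 3986; decoding their escapes does not
# change what the URL refers to.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

# Matches one percent-escape such as %33 or %2f.
_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def _decode_or_upper(match):
    """Decode an escape of an unreserved character, else keep it (uppercased)."""
    char = chr(int(match.group(1), 16))
    if char in UNRESERVED:
        return char
    return "%" + match.group(1).upper()


def normalize_escapes(url):
    """Hypothetical helper: encode literal spaces, decode unreserved escapes."""
    # Assumption: a literal space is the only raw character that needs encoding.
    url = url.replace(" ", "%20")
    return _ESCAPE.sub(_decode_or_upper, url)
```

With this sketch, "http://example.com/a b" and "http://example.com/a%20b" 
normalize to the same string, as do "http://example.com/%33" and 
"http://example.com/3", which is the duplicate-avoidance effect the issue 
describes.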


        
