[jira] [Resolved] (OPENNLP-1550) Update CSS styles of the Manual
[ https://issues.apache.org/jira/browse/OPENNLP-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita resolved OPENNLP-1550. - Fix Version/s: 2.3.3 Resolution: Fixed > Update CSS styles of the Manual > --- > > Key: OPENNLP-1550 > URL: https://issues.apache.org/jira/browse/OPENNLP-1550 > Project: OpenNLP > Issue Type: Improvement >Reporter: Bruno P. Kinoshita >Assignee: Bruno P. Kinoshita >Priority: Minor > Fix For: 2.3.3 > > > Just to have less of a 90's look :) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (OPENNLP-1550) Update CSS styles of the Manual
[ https://issues.apache.org/jira/browse/OPENNLP-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-1550: Summary: Update CSS styles of the Manual (was: Update styles of the Manual) > Update CSS styles of the Manual > --- > > Key: OPENNLP-1550 > URL: https://issues.apache.org/jira/browse/OPENNLP-1550 > Project: OpenNLP > Issue Type: Improvement >Reporter: Bruno P. Kinoshita >Assignee: Bruno P. Kinoshita >Priority: Minor > > Just to have less of a 90's look :) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (OPENNLP-1550) Update styles of the Manual
[ https://issues.apache.org/jira/browse/OPENNLP-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita reassigned OPENNLP-1550: --- Assignee: Bruno P. Kinoshita > Update styles of the Manual > --- > > Key: OPENNLP-1550 > URL: https://issues.apache.org/jira/browse/OPENNLP-1550 > Project: OpenNLP > Issue Type: Improvement >Reporter: Bruno P. Kinoshita >Assignee: Bruno P. Kinoshita >Priority: Minor > > Just to have less of a 90's look :) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (OPENNLP-1550) Update styles of the Manual
Bruno P. Kinoshita created OPENNLP-1550: --- Summary: Update styles of the Manual Key: OPENNLP-1550 URL: https://issues.apache.org/jira/browse/OPENNLP-1550 Project: OpenNLP Issue Type: Improvement Reporter: Bruno P. Kinoshita Just to have less of a 90's look :) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (OPENNLP-1531) Add Portuguese abbreviation dictionary
[ https://issues.apache.org/jira/browse/OPENNLP-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita resolved OPENNLP-1531. - Resolution: Fixed > Add Portuguese abbreviation dictionary > -- > > Key: OPENNLP-1531 > URL: https://issues.apache.org/jira/browse/OPENNLP-1531 > Project: OpenNLP > Issue Type: Improvement >Affects Versions: 2.3.1 >Reporter: Bruno P. Kinoshita >Assignee: Bruno P. Kinoshita >Priority: Minor > Fix For: 2.3.2 > > > Similar to the addition in OPENNLP-570 and OPENNLP-1526, an abbreviation > dictionary for Portuguese sentence detection and tokenisation might be > beneficial. > Aims: > - Create and add a new file {{abb_PT.xml}} to _opennlp-tools/lang/pt_ > - Add basic set of test cases > Other: > - Confirm if European/Brazilian/African/Creole Portuguese have the same > abbreviations or if we need different languages... -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (OPENNLP-1531) Add Portuguese abbreviation dictionary
[ https://issues.apache.org/jira/browse/OPENNLP-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17801478#comment-17801478 ] Bruno P. Kinoshita commented on OPENNLP-1531: - To generate the list of abbreviations from the Academia Brasileira de Letras:
{code:java}
// https://www.academia.org.br/nossa-lingua/reducoes
var tokens = [...document.getElementsByTagName("p")].reduce((acc, p) => {
  const text = p.innerText
  if (text !== undefined && text !== null && text.trim() !== "") {
    const tokens = text.split(" ")
    if (tokens.length >= 2) {
      // The first token of each paragraph is the candidate abbreviation
      const possibleAbbreviation = tokens[0].trim().toLowerCase()
      if (possibleAbbreviation.includes('.') && !acc.includes(possibleAbbreviation)) {
        acc.push(possibleAbbreviation)
      }
    }
  }
  return acc
}, [])

// Create OpenNLP abb_PT.xml file
var xml = [``]
tokens.forEach(token => {
  xml.push(` ${token} `)
})
xml.push('')

// Copy to clipboard
copy(xml.join('\n'))
{code}
Running that in the Firefox console will parse the HTML, create the XML body, and copy it to the system clipboard. > Add Portuguese abbreviation dictionary > -- > > Key: OPENNLP-1531 > URL: https://issues.apache.org/jira/browse/OPENNLP-1531 > Project: OpenNLP > Issue Type: Improvement >Affects Versions: 2.3.1 >Reporter: Bruno P. Kinoshita >Priority: Minor > > Similar to the addition in OPENNLP-570 and OPENNLP-1526, an abbreviation > dictionary for Portuguese sentence detection and tokenisation might be > beneficial. > Aims: > - Create and add a new file {{abb_PT.xml}} to _opennlp-tools/lang/pt_ > - Add basic set of test cases > Other: > - Confirm if European/Brazilian/African/Creole Portuguese have the same > abbreviations or if we need different languages... -- This message was sent by Atlassian Jira (v8.20.10#820010)
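The filtering and XML-building steps of the browser script in this comment could also be sketched in Java. This is only an illustration: the class and method names are hypothetical, the input is assumed to be the already extracted first token of each paragraph, and the element layout (a dictionary root with one entry/token pair per abbreviation, case_sensitive set to false) is an assumption about the dictionary format that should be checked against an existing abb_*.xml file in opennlp-tools/lang.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class AbbreviationDictionarySketch {

    /** Keeps lowercased tokens that contain a dot, dropping duplicates but preserving order. */
    static Set<String> collectAbbreviations(List<String> firstTokens) {
        Set<String> abbreviations = new LinkedHashSet<>();
        for (String token : firstTokens) {
            String candidate = token.trim().toLowerCase(Locale.ROOT);
            if (candidate.contains(".")) {
                abbreviations.add(candidate);
            }
        }
        return abbreviations;
    }

    /** Renders the ASSUMED dictionary layout: one entry/token pair per abbreviation. */
    static String toXml(Set<String> abbreviations) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        xml.append("<dictionary case_sensitive=\"false\">\n");
        for (String abbreviation : abbreviations) {
            xml.append("<entry>\n<token>")
               .append(escape(abbreviation))
               .append("</token>\n</entry>\n");
        }
        xml.append("</dictionary>\n");
        return xml.toString();
    }

    /** Escapes the characters that are unsafe in XML element content. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        // Sample tokens are illustrative, not taken from the source page.
        Set<String> abbreviations = collectAbbreviations(List.of("Sr.", "Dra.", "sr.", "palavra"));
        System.out.print(toXml(abbreviations));
    }
}
```

A LinkedHashSet replaces the script's acc.includes check, so duplicate detection stays cheap while the order of first appearance is preserved.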
[jira] [Commented] (OPENNLP-1531) Add Portuguese abbreviation dictionary
[ https://issues.apache.org/jira/browse/OPENNLP-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17801477#comment-17801477 ] Bruno P. Kinoshita commented on OPENNLP-1531: - The list of abbreviations seems to be the same between European and Brazilian Portuguese. Can't say if the African or Creole Portuguese follow the same list, but will leave that for a follow-up issue. > Add Portuguese abbreviation dictionary > -- > > Key: OPENNLP-1531 > URL: https://issues.apache.org/jira/browse/OPENNLP-1531 > Project: OpenNLP > Issue Type: Improvement >Affects Versions: 2.3.1 >Reporter: Bruno P. Kinoshita >Priority: Minor > > Similar to the addition in OPENNLP-570 and OPENNLP-1526, an abbreviation > dictionary for Portuguese sentence detection and tokenisation might be > beneficial. > Aims: > - Create and add a new file {{abb_PT.xml}} to _opennlp-tools/lang/pt_ > - Add basic set of test cases > Other: > - Confirm if European/Brazilian/African/Creole Portuguese have the same > abbreviations or if we need different languages... -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (OPENNLP-1531) Add Portuguese abbreviation dictionary
Bruno P. Kinoshita created OPENNLP-1531: --- Summary: Add Portuguese abbreviation dictionary Key: OPENNLP-1531 URL: https://issues.apache.org/jira/browse/OPENNLP-1531 Project: OpenNLP Issue Type: Improvement Affects Versions: 2.3.1 Reporter: Bruno P. Kinoshita Fix For: 2.3.2 Similar to the addition in OPENNLP-570 and OPENNLP-1526, an abbreviation dictionary for Portuguese sentence detection and tokenisation might be beneficial. Aims: - Create and add a new file {{abb_PT.xml}} to _opennlp-tools/lang/pt_ - Add basic set of test cases Other: - Confirm if European/Brazilian/African/Creole Portuguese have the same abbreviations or if we need different languages... -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (OPENNLP-1528) Review Catalan regexp for the ela germinada
[ https://issues.apache.org/jira/browse/OPENNLP-1528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-1528: Description: I shared on Twitter about the issue with the word "ós" found in our tokenizer tests, and Joan Montané (unjoanqualsevol on Twitter) replied pointing that our regexp for Catalan didn't seem right. Created this issue so we can test & fix it. >Regexp is not fully correct. Catalan written language uses middle dot / >interpunct (U+00B7) as inner word character: cel·la, goril·la, instal·lar, >cancel·lar,... !image-2023-12-11-15-20-31-518.png|width=365,height=429! was: I shared on Twitter about the issue with the word "ós" found in our tokenizer tests, and Joan Montané (unjoanqualsevol on Twitter) replied pointing that our regexp for Catalan didn't seem right. Created this issue so we can test & fix it. !image-2023-12-11-15-20-31-518.png|width=365,height=429! > Review Catalan regexp for the ela germinada > --- > > Key: OPENNLP-1528 > URL: https://issues.apache.org/jira/browse/OPENNLP-1528 > Project: OpenNLP > Issue Type: Bug >Reporter: Bruno P. Kinoshita >Assignee: Bruno P. Kinoshita >Priority: Minor > Attachments: image-2023-12-11-15-20-31-518.png > > > I shared on Twitter about the issue with the word "ós" found in our tokenizer > tests, and Joan Montané (unjoanqualsevol on Twitter) replied pointing that > our regexp for Catalan didn't seem right. > Created this issue so we can test & fix it. > >Regexp is not fully correct. Catalan written language uses middle dot / > >interpunct (U+00B7) as inner word character: cel·la, goril·la, instal·lar, > >cancel·lar,... > !image-2023-12-11-15-20-31-518.png|width=365,height=429! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (OPENNLP-1528) Review Catalan regexp for the ela germinada
[ https://issues.apache.org/jira/browse/OPENNLP-1528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-1528: Description: I shared on Twitter about the issue with the word "ós" found in our tokenizer tests, and Joan Montané (unjoanqualsevol on Twitter) replied pointing that our regexp for Catalan didn't seem right. Created this issue so we can test & fix it. > Regexp is not fully correct. Catalan written language uses middle dot / >interpunct (U+00B7) as inner word character: cel·la, goril·la, instal·lar, >cancel·lar,... !image-2023-12-11-15-20-31-518.png|width=365,height=429! was: I shared on Twitter about the issue with the word "ós" found in our tokenizer tests, and Joan Montané (unjoanqualsevol on Twitter) replied pointing that our regexp for Catalan didn't seem right. Created this issue so we can test & fix it. {noformat} Regexp is not fully correct. Catalan written language uses middle dot / interpunct (U+00B7) as inner word character: cel·la, goril·la, instal·lar, cancel·lar,... {noformat} !image-2023-12-11-15-20-31-518.png|width=365,height=429! > Review Catalan regexp for the ela germinada > --- > > Key: OPENNLP-1528 > URL: https://issues.apache.org/jira/browse/OPENNLP-1528 > Project: OpenNLP > Issue Type: Bug >Reporter: Bruno P. Kinoshita >Assignee: Bruno P. Kinoshita >Priority: Minor > Attachments: image-2023-12-11-15-20-31-518.png > > > I shared on Twitter about the issue with the word "ós" found in our tokenizer > tests, and Joan Montané (unjoanqualsevol on Twitter) replied pointing that > our regexp for Catalan didn't seem right. > Created this issue so we can test & fix it. > > Regexp is not fully correct. Catalan written language uses middle dot / > >interpunct (U+00B7) as inner word character: cel·la, goril·la, instal·lar, > >cancel·lar,... > !image-2023-12-11-15-20-31-518.png|width=365,height=429! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (OPENNLP-1528) Review Catalan regexp for the ela germinada
[ https://issues.apache.org/jira/browse/OPENNLP-1528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-1528: Description: I shared on Twitter about the issue with the word "ós" found in our tokenizer tests, and Joan Montané (unjoanqualsevol on Twitter) replied pointing that our regexp for Catalan didn't seem right. Created this issue so we can test & fix it. {noformat} Regexp is not fully correct. Catalan written language uses middle dot / interpunct (U+00B7) as inner word character: cel·la, goril·la, instal·lar, cancel·lar,... {noformat} !image-2023-12-11-15-20-31-518.png|width=365,height=429! was: I shared on Twitter about the issue with the word "ós" found in our tokenizer tests, and Joan Montané (unjoanqualsevol on Twitter) replied pointing that our regexp for Catalan didn't seem right. Created this issue so we can test & fix it. >Regexp is not fully correct. Catalan written language uses middle dot / >interpunct (U+00B7) as inner word character: cel·la, goril·la, instal·lar, >cancel·lar,... !image-2023-12-11-15-20-31-518.png|width=365,height=429! > Review Catalan regexp for the ela germinada > --- > > Key: OPENNLP-1528 > URL: https://issues.apache.org/jira/browse/OPENNLP-1528 > Project: OpenNLP > Issue Type: Bug >Reporter: Bruno P. Kinoshita >Assignee: Bruno P. Kinoshita >Priority: Minor > Attachments: image-2023-12-11-15-20-31-518.png > > > I shared on Twitter about the issue with the word "ós" found in our tokenizer > tests, and Joan Montané (unjoanqualsevol on Twitter) replied pointing that > our regexp for Catalan didn't seem right. > Created this issue so we can test & fix it. > {noformat} > Regexp is not fully correct. Catalan written language uses middle dot / > interpunct (U+00B7) as inner word character: cel·la, goril·la, instal·lar, > cancel·lar,... {noformat} > > !image-2023-12-11-15-20-31-518.png|width=365,height=429! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (OPENNLP-1528) Review Catalan regexp for the ela germinada
Bruno P. Kinoshita created OPENNLP-1528: --- Summary: Review Catalan regexp for the ela germinada Key: OPENNLP-1528 URL: https://issues.apache.org/jira/browse/OPENNLP-1528 Project: OpenNLP Issue Type: Bug Reporter: Bruno P. Kinoshita Assignee: Bruno P. Kinoshita Attachments: image-2023-12-11-15-20-31-518.png I shared on Twitter about the issue with the word "ós" found in our tokenizer tests, and Joan Montané (unjoanqualsevol on Twitter) replied pointing that our regexp for Catalan didn't seem right. Created this issue so we can test & fix it. !image-2023-12-11-15-20-31-518.png|width=365,height=429! -- This message was sent by Atlassian Jira (v8.20.10#820010)
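As a rough illustration of the point raised in this issue, a word pattern can treat the middle dot (U+00B7) as an inner-word character only. The pattern below is a hypothetical sketch, not the regexp used in OpenNLP: it accepts runs of letters or digits, optionally joined by single middle dots, and rejects a dot at the start or end of a token.

```java
import java.util.regex.Pattern;

class CatalanWordPatternSketch {

    // Hypothetical pattern: letter/digit runs joined by single middle dots (U+00B7).
    // The doubled backslash hands the escape to the regex engine, not the Java compiler.
    static final Pattern CATALAN_WORD =
        Pattern.compile("^[\\p{L}0-9]+(?:\\u00B7[\\p{L}0-9]+)*$");

    static boolean isWord(String token) {
        return CATALAN_WORD.matcher(token).matches();
    }

    public static void main(String[] args) {
        // "cel", middle dot, "la": the dot sits between two letter runs.
        System.out.println(isWord("cel\u00B7la"));
        // A leading middle dot is not an inner-word position.
        System.out.println(isWord("\u00B7la"));
    }
}
```

Using \p{L} rather than an explicit character list also covers accented words such as "ós". A language-specific list would be stricter, so which one fits better depends on how the tokenizer factory is meant to behave.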
[jira] [Resolved] (OPENNLP-1480) Quick fix for HTML href javadoc
[ https://issues.apache.org/jira/browse/OPENNLP-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita resolved OPENNLP-1480. - Resolution: Fixed > Quick fix for HTML href javadoc > --- > > Key: OPENNLP-1480 > URL: https://issues.apache.org/jira/browse/OPENNLP-1480 > Project: OpenNLP > Issue Type: Bug > Components: Documentation >Affects Versions: 2.1.1 >Reporter: Bruno P. Kinoshita >Assignee: Bruno P. Kinoshita >Priority: Trivial > Fix For: 2.1.2 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (OPENNLP-1480) Quick fix for HTML href javadoc
Bruno P. Kinoshita created OPENNLP-1480: --- Summary: Quick fix for HTML href javadoc Key: OPENNLP-1480 URL: https://issues.apache.org/jira/browse/OPENNLP-1480 Project: OpenNLP Issue Type: Bug Components: Documentation Affects Versions: 2.1.1 Reporter: Bruno P. Kinoshita Assignee: Bruno P. Kinoshita Fix For: 2.1.2 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (OPENNLP-1479) Write better tests for pattern verification (tokenizers)
Bruno P. Kinoshita created OPENNLP-1479: --- Summary: Write better tests for pattern verification (tokenizers) Key: OPENNLP-1479 URL: https://issues.apache.org/jira/browse/OPENNLP-1479 Project: OpenNLP Issue Type: Improvement Components: Tokenizer Affects Versions: 2.1.1 Reporter: Bruno P. Kinoshita Fix For: 2.1.2 From [https://github.com/apache/opennlp/pull/516#issuecomment-1455015772] At the moment our tests verify that the tokenizer objects are created correctly (i.e. tests getters and setters, constructor, etc.), without verifying the actual behavior when used in conjunction with other classes (factory, tokenizer, trainers, etc). It would be best to test the patterns used in the factories for different languages with some interesting sample data (maybe something from project gutenberg, open source news sites, etc.). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (OPENNLP-1473) Add .asf.yaml
[ https://issues.apache.org/jira/browse/OPENNLP-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita resolved OPENNLP-1473. - Resolution: Fixed > Add .asf.yaml > - > > Key: OPENNLP-1473 > URL: https://issues.apache.org/jira/browse/OPENNLP-1473 > Project: OpenNLP > Issue Type: Task > Components: Documentation >Reporter: Bruno P. Kinoshita >Assignee: Bruno P. Kinoshita >Priority: Trivial > Fix For: 2.1.2 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (OPENNLP-1474) Create tokenizer factories for other langs (Spanish, Italian, ...)
Bruno P. Kinoshita created OPENNLP-1474: --- Summary: Create tokenizer factories for other langs (Spanish, Italian, ...) Key: OPENNLP-1474 URL: https://issues.apache.org/jira/browse/OPENNLP-1474 Project: OpenNLP Issue Type: Improvement Components: Tokenizer Affects Versions: 2.1.1 Reporter: Bruno P. Kinoshita Fix For: 2.2.0 From [https://github.com/apache/opennlp/pull/506#issuecomment-1445849746] We can create more factories for languages such as Spanish and Italian. For example:
{noformat}
// From: https://it.wikipedia.org/wiki/Alfabeto_italiano
private static final Pattern ITALIAN = Pattern.compile("^[0-9a-zàèéìîíòóùüA-ZÀÈÉÌÎÍÒÓÙÜ]+$");
// From: https://en.wikiversity.org/wiki/Alphabet/Spanish_alphabet & https://en.wikipedia.org/wiki/Spanish_orthography#Alphabet_in_Spanish & https://www.fundeu.es/consulta/tilde-en-la-y-y-griega-o-ye-24786/
private static final Pattern SPANISH = Pattern.compile("^[0-9a-záéíóúüýñA-ZÁÉÍÓÚÝÑ]+$");
{noformat}
Community feedback would be appreciated. -- This message was sent by Atlassian Jira (v8.20.10#820010)
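To help gather feedback, the two patterns quoted in this issue can be exercised against a few sample words. The sketch below copies the character classes from the issue but spells them with Unicode escapes so the file compiles under any source encoding. The class name, the helper method and the sample words are illustrative assumptions.

```java
import java.util.regex.Pattern;

class LanguagePatternCheck {

    // Same character classes as the ITALIAN and SPANISH patterns in the issue,
    // written with Unicode escapes instead of literal accented letters.
    static final Pattern ITALIAN = Pattern.compile(
        "^[0-9a-z\u00E0\u00E8\u00E9\u00EC\u00EE\u00ED\u00F2\u00F3\u00F9\u00FC"
        + "A-Z\u00C0\u00C8\u00C9\u00CC\u00CE\u00CD\u00D2\u00D3\u00D9\u00DC]+$");

    static final Pattern SPANISH = Pattern.compile(
        "^[0-9a-z\u00E1\u00E9\u00ED\u00F3\u00FA\u00FC\u00FD\u00F1"
        + "A-Z\u00C1\u00C9\u00CD\u00D3\u00DA\u00DD\u00D1]+$");

    static boolean matches(Pattern pattern, String token) {
        return pattern.matcher(token).matches();
    }

    public static void main(String[] args) {
        // Italian word ending in a-grave.
        System.out.println(matches(ITALIAN, "citt\u00E0"));
        // Spanish word containing n-tilde.
        System.out.println(matches(SPANISH, "ni\u00F1o"));
        // All-caps Spanish word containing U-diaeresis.
        System.out.println(matches(SPANISH, "PING\u00DCINO"));
    }
}
```

As quoted, the Spanish class lists the lowercase ü but not the uppercase Ü, so the all-caps sample above does not match. Whether that is intended may be worth confirming as part of the community feedback.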
[jira] [Created] (OPENNLP-1473) Add .asf.yaml
Bruno P. Kinoshita created OPENNLP-1473: --- Summary: Add .asf.yaml Key: OPENNLP-1473 URL: https://issues.apache.org/jira/browse/OPENNLP-1473 Project: OpenNLP Issue Type: Task Components: Documentation Reporter: Bruno P. Kinoshita Assignee: Bruno P. Kinoshita Fix For: 2.1.2 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (OPENNLP-985) A condition that is always true.
[ https://issues.apache.org/jira/browse/OPENNLP-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-985: --- Fix Version/s: 2.1.1 > A condition that is always true. > > > Key: OPENNLP-985 > URL: https://issues.apache.org/jira/browse/OPENNLP-985 > Project: OpenNLP > Issue Type: Bug > Components: wsd >Reporter: JC >Assignee: Bruno P. Kinoshita >Priority: Trivial > Fix For: 2.1.1 > > > I've found a code smell or typo in a recent github snapshot. (opennlp-sandbox) > Path: > opennlp-wsd/src/main/java/opennlp/tools/disambiguator/datareader/Paragraph.java > {code:java} > 85 public boolean contains(String wordTag) { > 86 > 87 for (Sentence isentence : this.getSsentences()) { > 88 for (Word iword : isentence.getIwords()) { > 89 if (iword.equals(iword)) > 90 return true; > 91 } > 92 } > 93 > 94 return false; > 95 } > {code} > Line 89 is always true. This might be a trivial issue but wanted to report > just in case. Thanks! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (OPENNLP-985) A condition that is always true.
[ https://issues.apache.org/jira/browse/OPENNLP-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita resolved OPENNLP-985. Resolution: Fixed > A condition that is always true. > > > Key: OPENNLP-985 > URL: https://issues.apache.org/jira/browse/OPENNLP-985 > Project: OpenNLP > Issue Type: Bug > Components: wsd >Reporter: JC >Assignee: Bruno P. Kinoshita >Priority: Trivial > Fix For: 2.1.1 > > > I've found a code smell or typo in a recent github snapshot. (opennlp-sandbox) > Path: > opennlp-wsd/src/main/java/opennlp/tools/disambiguator/datareader/Paragraph.java > {code:java} > 85 public boolean contains(String wordTag) { > 86 > 87 for (Sentence isentence : this.getSsentences()) { > 88 for (Word iword : isentence.getIwords()) { > 89 if (iword.equals(iword)) > 90 return true; > 91 } > 92 } > 93 > 94 return false; > 95 } > {code} > Line 89 is always true. This might be a trivial issue but wanted to report > just in case. Thanks! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (OPENNLP-985) A condition that is always true.
[ https://issues.apache.org/jira/browse/OPENNLP-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita reassigned OPENNLP-985: -- Assignee: Bruno P. Kinoshita > A condition that is always true. > > > Key: OPENNLP-985 > URL: https://issues.apache.org/jira/browse/OPENNLP-985 > Project: OpenNLP > Issue Type: Bug > Components: wsd >Reporter: JC >Assignee: Bruno P. Kinoshita >Priority: Trivial > > I've found a code smell or typo in a recent github snapshot. (opennlp-sandbox) > Path: > opennlp-wsd/src/main/java/opennlp/tools/disambiguator/datareader/Paragraph.java > {code:java} > 85 public boolean contains(String wordTag) { > 86 > 87 for (Sentence isentence : this.getSsentences()) { > 88 for (Word iword : isentence.getIwords()) { > 89 if (iword.equals(iword)) > 90 return true; > 91 } > 92 } > 93 > 94 return false; > 95 } > {code} > Line 89 is always true. This might be a trivial issue but wanted to report > just in case. Thanks! -- This message was sent by Atlassian Jira (v8.20.10#820010)
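For context, the reported condition compares iword with itself, so the method returns true as soon as any word exists and never looks at the wordTag argument. One plausible correction is sketched below with plain strings standing in for the sandbox Word and Sentence classes, which the report does not show. This is an assumption about the intent, not the change that was committed.

```java
import java.util.List;

class ParagraphContainsSketch {

    /**
     * Returns true only when some word in some sentence equals wordTag.
     * Each sentence is modelled as a list of word strings (a stand-in for the
     * sandbox Sentence and Word types, which are not shown in the report).
     */
    static boolean contains(List<List<String>> sentences, String wordTag) {
        for (List<String> sentence : sentences) {
            for (String word : sentence) {
                // Compare against the argument, not against the word itself.
                if (word.equals(wordTag)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<List<String>> paragraph = List.of(List.of("the", "dog"), List.of("barks"));
        System.out.println(contains(paragraph, "dog")); // "dog" is in the first sentence
        System.out.println(contains(paragraph, "cat")); // "cat" appears nowhere
    }
}
```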
[jira] [Comment Edited] (OPENNLP-1447) Move from System.out/System.err to SLF4J
[ https://issues.apache.org/jira/browse/OPENNLP-1447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17686534#comment-17686534 ] Bruno P. Kinoshita edited comment on OPENNLP-1447 at 2/9/23 2:06 PM: - For tests that rely on capturing the process output, maybe something like what's described on these SO answers could work? [https://stackoverflow.com/questions/29076981/how-to-intercept-slf4j-with-logback-logging-via-a-junit-test] However, I think we still need to review if these tests have to be kept as they are (i.e. writing to some output stream) or if they should be re-written at a Java object input/output level, instead of I/O. was (Author: kinow): For tests that rely on capturing the process output, maybe something like what's described on these SO answers could work? https://stackoverflow.com/questions/29076981/how-to-intercept-slf4j-with-logback-logging-via-a-junit-test > Move from System.out/System.err to SLF4J > > > Key: OPENNLP-1447 > URL: https://issues.apache.org/jira/browse/OPENNLP-1447 > Project: OpenNLP > Issue Type: Epic >Reporter: Richard Zowalla >Priority: Major > > As discussed on the mailing list, we are in favour of moving from System.out > / System.err to proper log output. > The discussion is here: > https://lists.apache.org/thread/vt748qbz5onhwhh70kky9wk1o5zm42tm > To reduce reviewer burden, we should tackle the task in several steps. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (OPENNLP-1447) Move from System.out/System.err to SLF4J
[ https://issues.apache.org/jira/browse/OPENNLP-1447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17686534#comment-17686534 ] Bruno P. Kinoshita commented on OPENNLP-1447: - For tests that rely on capturing the process output, maybe something like what's described on these SO answers could work? https://stackoverflow.com/questions/29076981/how-to-intercept-slf4j-with-logback-logging-via-a-junit-test > Move from System.out/System.err to SLF4J > > > Key: OPENNLP-1447 > URL: https://issues.apache.org/jira/browse/OPENNLP-1447 > Project: OpenNLP > Issue Type: Epic >Reporter: Richard Zowalla >Priority: Major > > As discussed on the mailing list, we are in favour of moving from System.out > / System.err to proper log output. > The discussion is here: > https://lists.apache.org/thread/vt748qbz5onhwhh70kky9wk1o5zm42tm > To reduce reviewer burden, we should tackle the task in several steps. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (OPENNLP-1403) Enhance JavaDoc in opennlp.tools.langdetect and opennlp.tools.languagemodel packages
[ https://issues.apache.org/jira/browse/OPENNLP-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita resolved OPENNLP-1403. - Assignee: Bruno P. Kinoshita Resolution: Fixed > Enhance JavaDoc in opennlp.tools.langdetect and opennlp.tools.languagemodel > packages > > > Key: OPENNLP-1403 > URL: https://issues.apache.org/jira/browse/OPENNLP-1403 > Project: OpenNLP > Issue Type: Improvement > Components: Documentation >Affects Versions: 2.1.0 >Reporter: Martin Wiesner >Assignee: Bruno P. Kinoshita >Priority: Minor > Fix For: 2.1.1 > > > The JavaDoc of the _opennlp.tools.langdetect_ and > _opennlp.tools.languagemodel_ packages suffers from several inconsistencies > and missing descriptions. Moreover, several typos are present that need > sanitizing. > It needs enhancements and/or additions to provide more clarity for readers of > that part of the OpenNLP API. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (OPENNLP-1401) Enhance JavaDoc in opennlp.tools.chunker package
[ https://issues.apache.org/jira/browse/OPENNLP-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita resolved OPENNLP-1401. - Resolution: Fixed > Enhance JavaDoc in opennlp.tools.chunker package > > > Key: OPENNLP-1401 > URL: https://issues.apache.org/jira/browse/OPENNLP-1401 > Project: OpenNLP > Issue Type: Improvement > Components: Chunker, Documentation >Affects Versions: 2.1.0 >Reporter: Martin Wiesner >Assignee: Bruno P. Kinoshita >Priority: Minor > Fix For: 2.1.1 > > > The JavaDoc of the opennlp.tools.chunker package suffers from several > inconsistencies and missing descriptions. Moreover, several typos are present > that need sanitizing. > It needs enhancements and/or additions to provide more clarity for readers of > that part of the OpenNLP API. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (OPENNLP-1399) Integrate ASF Matomo into OpenNLP website
[ https://issues.apache.org/jira/browse/OPENNLP-1399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-1399: Affects Version/s: 2.1.0 > Integrate ASF Matomo into OpenNLP website > - > > Key: OPENNLP-1399 > URL: https://issues.apache.org/jira/browse/OPENNLP-1399 > Project: OpenNLP > Issue Type: Task > Components: Website >Affects Versions: 2.1.0 >Reporter: Richard Zowalla >Assignee: Richard Zowalla >Priority: Minor > Fix For: 2.1.1 > > > as the title says -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (OPENNLP-1399) Integrate ASF Matomo into OpenNLP website
[ https://issues.apache.org/jira/browse/OPENNLP-1399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita resolved OPENNLP-1399. - Fix Version/s: 2.1.1 Resolution: Fixed > Integrate ASF Matomo into OpenNLP website > - > > Key: OPENNLP-1399 > URL: https://issues.apache.org/jira/browse/OPENNLP-1399 > Project: OpenNLP > Issue Type: Task > Components: Website >Reporter: Richard Zowalla >Assignee: Richard Zowalla >Priority: Minor > Fix For: 2.1.1 > > > as the title says -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (OPENNLP-1400) Enhance JavaDoc in opennlp.tools.sentdetect package
[ https://issues.apache.org/jira/browse/OPENNLP-1400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-1400: Affects Version/s: 2.1.0 > Enhance JavaDoc in opennlp.tools.sentdetect package > --- > > Key: OPENNLP-1400 > URL: https://issues.apache.org/jira/browse/OPENNLP-1400 > Project: OpenNLP > Issue Type: Improvement > Components: Documentation >Affects Versions: 2.1.0 >Reporter: Martin Wiesner >Assignee: Bruno P. Kinoshita >Priority: Minor > Fix For: 2.1.1 > > > The JavaDoc of the _opennlp.tools.sentdetect_ package suffers from several > inconsistencies and missing descriptions. Moreover, several typos are present > that need sanitizing. > It needs enhancements and/or additions to provide more clarity. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (OPENNLP-1400) Enhance JavaDoc in opennlp.tools.sentdetect package
[ https://issues.apache.org/jira/browse/OPENNLP-1400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita resolved OPENNLP-1400. - Resolution: Fixed Thanks for the PR and for setting the fix version! > Enhance JavaDoc in opennlp.tools.sentdetect package > --- > > Key: OPENNLP-1400 > URL: https://issues.apache.org/jira/browse/OPENNLP-1400 > Project: OpenNLP > Issue Type: Improvement > Components: Documentation >Reporter: Martin Wiesner >Assignee: Bruno P. Kinoshita >Priority: Minor > Fix For: 2.1.1 > > > The JavaDoc of the _opennlp.tools.sentdetect_ package suffers from several > inconsistencies and missing descriptions. Moreover, several typos are present > that need sanitizing. > It needs enhancements and/or additions to provide more clarity. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (OPENNLP-1400) Enhance JavaDoc in opennlp.tools.sentdetect package
[ https://issues.apache.org/jira/browse/OPENNLP-1400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita reassigned OPENNLP-1400: --- Assignee: Bruno P. Kinoshita > Enhance JavaDoc in opennlp.tools.sentdetect package > --- > > Key: OPENNLP-1400 > URL: https://issues.apache.org/jira/browse/OPENNLP-1400 > Project: OpenNLP > Issue Type: Improvement > Components: Documentation >Reporter: Martin Wiesner >Assignee: Bruno P. Kinoshita >Priority: Minor > Fix For: 2.1.1 > > > The JavaDoc of the _opennlp.tools.sentdetect_ package suffers from several > inconsistencies and missing descriptions. Moreover, several typos are present > that need sanitizing. > It needs enhancements and/or additions to provide more clarity. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (OPENNLP-689) ParserEvaluator test case
[ https://issues.apache.org/jira/browse/OPENNLP-689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita resolved OPENNLP-689. Resolution: Fixed > ParserEvaluator test case > - > > Key: OPENNLP-689 > URL: https://issues.apache.org/jira/browse/OPENNLP-689 > Project: OpenNLP > Issue Type: Test > Components: Parser >Affects Versions: tools-1.5.3 >Reporter: Rodrigo Agerri >Assignee: Bruno P. Kinoshita >Priority: Minor > Fix For: 2.1.1 > > > Add a test case for the ParserEvaluator (follow up of OPENNLP-31). Use the > content of the main method in that class. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (OPENNLP-689) ParserEvaluator test case
[ https://issues.apache.org/jira/browse/OPENNLP-689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-689: --- Fix Version/s: 2.1.1 > ParserEvaluator test case > - > > Key: OPENNLP-689 > URL: https://issues.apache.org/jira/browse/OPENNLP-689 > Project: OpenNLP > Issue Type: Test > Components: Parser >Affects Versions: tools-1.5.3 >Reporter: Rodrigo Agerri >Assignee: Bruno P. Kinoshita >Priority: Minor > Fix For: 2.1.1 > > > Add a test case for the ParserEvaluator (follow up of OPENNLP-31). Use the > content of the main method in that class. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (OPENNLP-1332) Default chunker context generator has inconsistent format
[ https://issues.apache.org/jira/browse/OPENNLP-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita resolved OPENNLP-1332. - Resolution: Fixed > Default chunker context generator has inconsistent format > - > > Key: OPENNLP-1332 > URL: https://issues.apache.org/jira/browse/OPENNLP-1332 > Project: OpenNLP > Issue Type: Improvement >Reporter: Stephen Foster >Assignee: Bruno P. Kinoshita >Priority: Minor > Fix For: 2.1.1 > > > In DefaultChunkerContextGenerator, when p_2 is assigned on line 56, it > follows a different format from every other assignment: > {code:java} > p_2 = "p_2" + preds[i - 2]; > {code} > The "p_2" is missing an equals sign - to be consistent it should read: > {code:java} > p_2 = "p_2=" + preds[i - 2]; > {code} > Apologies if this is a known issue - I did a brief search but didn't see > anything. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (OPENNLP-1341) Make the ConlluPOSSampleStream constructor public
[ https://issues.apache.org/jira/browse/OPENNLP-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita resolved OPENNLP-1341. - Assignee: Bruno P. Kinoshita Resolution: Fixed > Make the ConlluPOSSampleStream constructor public > - > > Key: OPENNLP-1341 > URL: https://issues.apache.org/jira/browse/OPENNLP-1341 > Project: OpenNLP > Issue Type: Wish > Components: POS Tagger >Affects Versions: 1.9.3 > Environment: Windows >Reporter: Reece H. Dunn >Assignee: Bruno P. Kinoshita >Priority: Trivial > Labels: easyfix > Fix For: 2.1.1 > > > The constructor for ConlluPOSSampleStream is currently package private as it > does not have a public/private keyword before it. Making it public will allow > an application to create that stream for use in training a POS model. > The other conllu stream classes (ConlluStream, ConlluLemmaSampleStream, > ConlluTokenSampleStream, ConlluSentenceSampleStream) have public > constructors, allowing them to be used in training their respective models. > The ConlluPOSSampleStreamFactory and ConlluLemmaSampleStreamFactory classes > are marked as internal use only, and rely on an API suited to the opennlp > cli. Note: ConlluSentenceSampleStreamFactory and > ConlluTokenSampleStreamFactory are missing an internal use only comment, but > should have one. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (OPENNLP-1332) Default chunker context generator has inconsistent format
[ https://issues.apache.org/jira/browse/OPENNLP-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita reassigned OPENNLP-1332: --- Assignee: Bruno P. Kinoshita > Default chunker context generator has inconsistent format > - > > Key: OPENNLP-1332 > URL: https://issues.apache.org/jira/browse/OPENNLP-1332 > Project: OpenNLP > Issue Type: Improvement >Reporter: Stephen Foster >Assignee: Bruno P. Kinoshita >Priority: Minor > Fix For: 2.1.1 > > > In DefaultChunkerContextGenerator, when p_2 is assigned on line 56, it > follows a different format from every other assignment: > {code:java} > p_2 = "p_2" + preds[i - 2]; > {code} > The "p_2" is missing an equals sign - to be consistent it should read: > {code:java} > p_2 = "p_2=" + preds[i - 2]; > {code} > Apologies if this is a known issue - I did a brief search but didn't see > anything. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (OPENNLP-1332) Default chunker context generator has inconsistent format
[ https://issues.apache.org/jira/browse/OPENNLP-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-1332: Fix Version/s: 2.1.1 > Default chunker context generator has inconsistent format > - > > Key: OPENNLP-1332 > URL: https://issues.apache.org/jira/browse/OPENNLP-1332 > Project: OpenNLP > Issue Type: Improvement >Reporter: Stephen Foster >Priority: Minor > Fix For: 2.1.1 > > > In DefaultChunkerContextGenerator, when p_2 is assigned on line 56, it > follows a different format from every other assignment: > {code:java} > p_2 = "p_2" + preds[i - 2]; > {code} > The "p_2" is missing an equals sign - to be consistent it should read: > {code:java} > p_2 = "p_2=" + preds[i - 2]; > {code} > Apologies if this is a known issue - I did a brief search but didn't see > anything. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (OPENNLP-1341) Make the ConlluPOSSampleStream constructor public
[ https://issues.apache.org/jira/browse/OPENNLP-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-1341: Fix Version/s: 2.1.1 > Make the ConlluPOSSampleStream constructor public > - > > Key: OPENNLP-1341 > URL: https://issues.apache.org/jira/browse/OPENNLP-1341 > Project: OpenNLP > Issue Type: Wish > Components: POS Tagger >Affects Versions: 1.9.3 > Environment: Windows >Reporter: Reece H. Dunn >Priority: Trivial > Labels: easyfix > Fix For: 2.1.1 > > > The constructor for ConlluPOSSampleStream is currently package private as it > does not have a public/private keyword before it. Making it public will allow > an application to create that stream for use in training a POS model. > The other conllu stream classes (ConlluStream, ConlluLemmaSampleStream, > ConlluTokenSampleStream, ConlluSentenceSampleStream) have public > constructors, allowing them to be used in training their respective models. > The ConlluPOSSampleStreamFactory and ConlluLemmaSampleStreamFactory classes > are marked as internal use only, and rely on an API suited to the opennlp > cli. Note: ConlluSentenceSampleStreamFactory and > ConlluTokenSampleStreamFactory are missing an internal use only comment, but > should have one. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (OPENNLP-1352) Migrate from Travis CI to GH Actions?
[ https://issues.apache.org/jira/browse/OPENNLP-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita resolved OPENNLP-1352. - Fix Version/s: 1.9.4 Resolution: Fixed > Migrate from Travis CI to GH Actions? > - > > Key: OPENNLP-1352 > URL: https://issues.apache.org/jira/browse/OPENNLP-1352 > Project: OpenNLP > Issue Type: Improvement > Components: Build, Packaging and Test >Affects Versions: 1.9.3 >Reporter: Bruno P. Kinoshita >Assignee: Bruno P. Kinoshita >Priority: Trivial > Fix For: 1.9.4 > > > Does anyone have any opinion on this? Travis CI seems to be quite slow now, > both to start the tests, and also to run them. > Migrating to GH Actions is now trivial. Other ASF projects have moved to > Actions (Commons) or are using ASF infra (Jena). -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (OPENNLP-1352) Migrate from Travis CI to GH Actions?
[ https://issues.apache.org/jira/browse/OPENNLP-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita reassigned OPENNLP-1352: --- Assignee: Bruno P. Kinoshita > Migrate from Travis CI to GH Actions? > - > > Key: OPENNLP-1352 > URL: https://issues.apache.org/jira/browse/OPENNLP-1352 > Project: OpenNLP > Issue Type: Improvement > Components: Build, Packaging and Test >Affects Versions: 1.9.3 >Reporter: Bruno P. Kinoshita >Assignee: Bruno P. Kinoshita >Priority: Trivial > > Does anyone have any opinion on this? Travis CI seems to be quite slow now, > both to start the tests, and also to run them. > Migrating to GH Actions is now trivial. Other ASF projects have moved to > Actions (Commons) or are using ASF infra (Jena). -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (OPENNLP-1350) MAIL_REGEX in UrlCharSequenceNormalizer causes quadratic complexity for certain input, and is also a bit imprecise
[ https://issues.apache.org/jira/browse/OPENNLP-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita resolved OPENNLP-1350. - Resolution: Fixed > MAIL_REGEX in UrlCharSequenceNormalizer causes quadratic complexity for > certain input, and is also a bit imprecise > -- > > Key: OPENNLP-1350 > URL: https://issues.apache.org/jira/browse/OPENNLP-1350 > Project: OpenNLP > Issue Type: Bug > Components: Language Detector >Affects Versions: 1.9.3 >Reporter: Jon Marius Venstad >Assignee: Bruno P. Kinoshita >Priority: Minor > Fix For: 1.9.4 > > > The regex used to strip email addresses from input, in > UrlCharSequenceNormalizer, has quadratic complexity when used with > {{String.replaceAll}}, and when input is a long sequence of characters > from the first character set, i.e., {{[-_.0-9A-Za-z]}}, which fails to > match the whole regex; then, the regex is evaluated again for each suffix of > this sequence, with linear cost each time. > This problem is promptly solved by adding a negative lookbehind with a single > character from that same set, to the first part of the regex. > > Additionally, the character {{_}} is allowed in the domain part of the mail > address, where it is in fact illegal. Likewise, the character {{+}} is > disallowed in the local part (the first part), where it _is_ legal, and > even quite common. The set of legal characters in the first part is actually > quite bonkers, per the RFC, but such usage is probably less common. See > [https://en.wikipedia.org/wiki/Email_address] for details. > > The suggested fix is to change the {{MAIL_REGEX}} declaration to > {code:java} > private static final Pattern MAIL_REGEX = > > Pattern.compile("(? {code} > For a sequence of ~100k characters, the run time is ~1 minute "on my machine". > With this change, it reduces to a few milliseconds. -- This message was sent by Atlassian Jira (v8.20.1#820001)
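The negative-lookbehind technique described in OPENNLP-1350 can be illustrated with a small sketch. The pattern below is an assumption written for this example; it is not the pattern from the issue (which is truncated in this archive) and not the one in UrlCharSequenceNormalizer. It only follows the description: a lookbehind on the local-part character set, `+` allowed in the local part, and `_` excluded from the domain part.

```java
import java.util.regex.Pattern;

public class MailRegexLookbehindSketch {

    // Assumed character set for the local part (illustration only).
    private static final String LOCAL = "[-+_.0-9A-Za-z]";

    // The lookbehind (?<!...) rejects any start position that is preceded by a
    // local-part character, so a long run of such characters is scanned once
    // instead of once per suffix.
    static final Pattern MAIL = Pattern.compile(
        "(?<!" + LOCAL + ")" + LOCAL + "+@[-.0-9A-Za-z]+\\.[A-Za-z]{2,}");

    // Replaces every match with a single space.
    static String stripMails(String text) {
        return MAIL.matcher(text).replaceAll(" ");
    }

    public static void main(String[] args) {
        System.out.println(stripMails("write to john.doe+nlp@example.org today"));

        // A long run of local-part characters without '@' never matches.
        StringBuilder run = new StringBuilder();
        for (int i = 0; i < 20000; i++) {
            run.append('a');
        }
        System.out.println(stripMails(run.toString()).length());
    }
}
```

Without the lookbehind, the matcher would retry at every position inside the run and rescan the rest of it each time, which is where the quadratic cost comes from.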
[jira] [Created] (OPENNLP-1352) Migrate from Travis CI to GH Actions?
Bruno P. Kinoshita created OPENNLP-1352: --- Summary: Migrate from Travis CI to GH Actions? Key: OPENNLP-1352 URL: https://issues.apache.org/jira/browse/OPENNLP-1352 Project: OpenNLP Issue Type: Improvement Components: Build, Packaging and Test Affects Versions: 1.9.3 Reporter: Bruno P. Kinoshita Does anyone have any opinion on this? Travis CI seems to be quite slow now, both to start the tests, and also to run them. Migrating to GH Actions is now trivial. Other ASF projects have moved to Actions (Commons) or are using ASF infra (Jena). -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (OPENNLP-1350) MAIL_REGEX in UrlCharSequenceNormalizer causes quadratic complexity for certain input, and is also a bit imprecise
[ https://issues.apache.org/jira/browse/OPENNLP-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-1350: Fix Version/s: 1.9.4 > MAIL_REGEX in UrlCharSequenceNormalizer causes quadratic complexity for > certain input, and is also a bit imprecise > -- > > Key: OPENNLP-1350 > URL: https://issues.apache.org/jira/browse/OPENNLP-1350 > Project: OpenNLP > Issue Type: Bug > Components: Language Detector >Affects Versions: 1.9.3 >Reporter: Jon Marius Venstad >Assignee: Bruno P. Kinoshita >Priority: Minor > Fix For: 1.9.4 > > > The regex used to strip email addresses from input, in > UrlCharSequenceNormalizer, has quadratic complexity when used with > {{String.replaceAll}}, and when input is a long sequence of characters > from the first character set, i.e., {{[-_.0-9A-Za-z]}}, which fails to > match the whole regex; then, the regex is evaluated again for each suffix of > this sequence, with linear cost each time. > This problem is promptly solved by adding a negative lookbehind with a single > character from that same set, to the first part of the regex. > > Additionally, the character {{_}} is allowed in the domain part of the mail > address, where it is in fact illegal. Likewise, the character {{+}} is > disallowed in the local part (the first part), where it _is_ legal, and > even quite common. The set of legal characters in the first part is actually > quite bonkers, per the RFC, but such usage is probably less common. See > [https://en.wikipedia.org/wiki/Email_address] for details. > > The suggested fix is to change the {{MAIL_REGEX}} declaration to > {code:java} > private static final Pattern MAIL_REGEX = > > Pattern.compile("(? {code} > For a sequence of ~100k characters, the run time is ~1 minute "on my machine". > With this change, it reduces to a few milliseconds. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (OPENNLP-1350) MAIL_REGEX in UrlCharSequenceNormalizer causes quadratic complexity for certain input, and is also a bit imprecise
[ https://issues.apache.org/jira/browse/OPENNLP-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita reassigned OPENNLP-1350: --- Assignee: Bruno P. Kinoshita > MAIL_REGEX in UrlCharSequenceNormalizer causes quadratic complexity for > certain input, and is also a bit imprecise > -- > > Key: OPENNLP-1350 > URL: https://issues.apache.org/jira/browse/OPENNLP-1350 > Project: OpenNLP > Issue Type: Bug > Components: Language Detector >Affects Versions: 1.9.3 >Reporter: Jon Marius Venstad >Assignee: Bruno P. Kinoshita >Priority: Minor > > The regex used to strip email addresses from input, in > UrlCharSequenceNormalizer, has quadratic complexity when used with > {{String.replaceAll}}, and when input is a long sequence of characters > from the first character set, i.e., {{[-_.0-9A-Za-z]}}, which fails to > match the whole regex; then, the regex is evaluated again for each suffix of > this sequence, with linear cost each time. > This problem is promptly solved by adding a negative lookbehind with a single > character from that same set, to the first part of the regex. > > Additionally, the character {{_}} is allowed in the domain part of the mail > address, where it is in fact illegal. Likewise, the character {{+}} is > disallowed in the local part (the first part), where it _is_ legal, and > even quite common. The set of legal characters in the first part is actually > quite bonkers, per the RFC, but such usage is probably less common. See > [https://en.wikipedia.org/wiki/Email_address] for details. > > The suggested fix is to change the {{MAIL_REGEX}} declaration to > {code:java} > private static final Pattern MAIL_REGEX = > > Pattern.compile("(? {code} > For a sequence of ~100k characters, the run time is ~1 minute "on my machine". > With this change, it reduces to a few milliseconds. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (OPENNLP-1327) Add site .asf.yaml file
[ https://issues.apache.org/jira/browse/OPENNLP-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-1327: Fix Version/s: 1.9.4 > Add site .asf.yaml file > --- > > Key: OPENNLP-1327 > URL: https://issues.apache.org/jira/browse/OPENNLP-1327 > Project: OpenNLP > Issue Type: Improvement > Components: Website >Affects Versions: 1.9.3 >Reporter: Bruno P. Kinoshita >Assignee: Bruno P. Kinoshita >Priority: Minor > Fix For: 1.9.4 > > > Placeholder for https://github.com/apache/opennlp-site/pull/64 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (OPENNLP-1327) Add site .asf.yaml file
[ https://issues.apache.org/jira/browse/OPENNLP-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita resolved OPENNLP-1327. - Resolution: Fixed > Add site .asf.yaml file > --- > > Key: OPENNLP-1327 > URL: https://issues.apache.org/jira/browse/OPENNLP-1327 > Project: OpenNLP > Issue Type: Improvement > Components: Website >Affects Versions: 1.9.3 >Reporter: Bruno P. Kinoshita >Assignee: Bruno P. Kinoshita >Priority: Minor > > Placeholder for https://github.com/apache/opennlp-site/pull/64 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (OPENNLP-1327) Add site .asf.yaml file
Bruno P. Kinoshita created OPENNLP-1327: --- Summary: Add site .asf.yaml file Key: OPENNLP-1327 URL: https://issues.apache.org/jira/browse/OPENNLP-1327 Project: OpenNLP Issue Type: Improvement Components: Website Affects Versions: 1.9.3 Reporter: Bruno P. Kinoshita Assignee: Bruno P. Kinoshita Placeholder for https://github.com/apache/opennlp-site/pull/64 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (OPENNLP-1127) Fix readme HTML file generated for distribution archives
Bruno P. Kinoshita created OPENNLP-1127: --- Summary: Fix readme HTML file generated for distribution archives Key: OPENNLP-1127 URL: https://issues.apache.org/jira/browse/OPENNLP-1127 Project: OpenNLP Issue Type: Bug Components: Build, Packaging and Test Affects Versions: 1.8.1 Reporter: Bruno P. Kinoshita Assignee: Bruno P. Kinoshita Priority: Trivial Fix For: 1.8.3 The current README.md file, in the project root directory, is used by the opennlp-distr module. The readme file is included in distribution files. There are a few changes in the master branch that were not released yet. Running `mvn clean install` will create the distribution files, and inside you should find a README.html created based on the README.md file, plus other files. The Markdown to HTML generation is being done through https://github.com/walokra/markdown-page-generator-plugin. This issue is for enhancements in the README file and also around the markdown-page-generator-plugin use. As 1.8.2 release is in progress, this may be included in 1.8.3. Cheers Bruno -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (OPENNLP-1013) [OpenNLP][R Language][1.5.3-2] Bug when using French models
[ https://issues.apache.org/jira/browse/OPENNLP-1013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079528#comment-16079528 ] Bruno P. Kinoshita commented on OPENNLP-1013: - Not a problem. The rJava is just the bridge. First I would check with the [CRAN opennlp R package|https://cran.r-project.org/package=openNLP]. They will be able to thoroughly analyse this issue, and point out whether there is a) an internal change that is necessary, b) something wrong with rJava, or c) a bug in the Java OpenNLP code. > [OpenNLP][R Language][1.5.3-2] Bug when using French models > --- > > Key: OPENNLP-1013 > URL: https://issues.apache.org/jira/browse/OPENNLP-1013 > Project: OpenNLP > Issue Type: Bug > Components: POS Tagger >Affects Versions: tools-1.5.3 > Environment: R Language, RStudio >Reporter: Iuri Deolindo Nogueira > Fix For: 1.8.2 > > > When using French models in R language, I'm receiving a "subscript out of > bounds" issue. I'm going to detail: > - > Well, I'm using French models for NLP in the R environment. To get the French > models, I'm using binaries compiled and developed by Nicolas: > https://sites.google.com/site/nicolashernandez/resources/opennlp > http://enicolashernandez.blogspot.fr/2012/12/apache-opennlp-fr-models.html > https://drive.google.com/drive/folders/0B4AyWQriFkxgWHR6QzlvcmxmdE0 > - > The problem happens only with the POS function. This is how I call the > function and respective issue: > Maxent_POS_Tag_Annotator(language = "fr", probs = TRUE, model = > paste0(, "fr-pos.bin")) > Issue: > Error in environment(f)$meta[[tag]] : subscript out of bounds > - > However, if I delete the language parameter, the issue does not happen > anymore: > Maxent_POS_Tag_Annotator(probs = TRUE, model = > paste0(, "fr-pos.bin")) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (OPENNLP-1013) [OpenNLP][R Language][1.5.3-2] Bug when using French models
[ https://issues.apache.org/jira/browse/OPENNLP-1013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079455#comment-16079455 ] Bruno P. Kinoshita commented on OPENNLP-1013: - Not fixed [~iurinog]. I added a comment, and the fix version was bumped to the next release. Though, again, I believe there is no easy way to troubleshoot and maybe fix the issue on our part. Better report an issue for the R package, and understand what's happening in the R code first. Maybe the Java API changed and now someone needs to update the R code, or maybe the language option wasn't really much used/tested before in the R code? > [OpenNLP][R Language][1.5.3-2] Bug when using French models > --- > > Key: OPENNLP-1013 > URL: https://issues.apache.org/jira/browse/OPENNLP-1013 > Project: OpenNLP > Issue Type: Bug > Components: POS Tagger >Affects Versions: tools-1.5.3 > Environment: R Language, RStudio >Reporter: Iuri Deolindo Nogueira > Fix For: 1.8.2 > > > When using French models in R language, I'm receiving a "subscript out of > bounds" issue. I'm going to detail: > - > Well, I'm using French models for NLP in the R environment. To get the French > models, I'm using binaries compiled and developed by Nicolas: > https://sites.google.com/site/nicolashernandez/resources/opennlp > http://enicolashernandez.blogspot.fr/2012/12/apache-opennlp-fr-models.html > https://drive.google.com/drive/folders/0B4AyWQriFkxgWHR6QzlvcmxmdE0 > - > The problem happens only with the POS function. This is how I call the > function and respective issue: > Maxent_POS_Tag_Annotator(language = "fr", probs = TRUE, model = > paste0(, "fr-pos.bin")) > Issue: > Error in environment(f)$meta[[tag]] : subscript out of bounds > - > However, if I delete the language parameter, the issue does not happen > anymore: > Maxent_POS_Tag_Annotator(probs = TRUE, model = > paste0(, "fr-pos.bin")) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (OPENNLP-1013) [OpenNLP][R Language][1.5.3-2] Bug when using French models
[ https://issues.apache.org/jira/browse/OPENNLP-1013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16078014#comment-16078014 ] Bruno P. Kinoshita commented on OPENNLP-1013: - I believe this issue could be in the R module, and not in OpenNLP. Looking at the code around [this part|https://github.com/cran/openNLP/blob/a1709dea5f8a92757fcfa5bf672aa922041dc119/R/pos.R#L54], it appears that for English (the default language value) we have the right fields in the meta var. But when you give it a different language, it tries to load the models in a different way. I am not sure what exactly the problem could be, but it looks to be in the R code of that package, not in the OpenNLP code. > [OpenNLP][R Language][1.5.3-2] Bug when using French models > --- > > Key: OPENNLP-1013 > URL: https://issues.apache.org/jira/browse/OPENNLP-1013 > Project: OpenNLP > Issue Type: Bug > Components: POS Tagger >Affects Versions: tools-1.5.3 > Environment: R Language, RStudio >Reporter: Iuri Deolindo Nogueira > > When using French models in R language, I'm receiving a "subscript out of > bounds" issue. I'm going to detail: > - > Well, I'm using French models for NLP in the R environment. To get the French > models, I'm using binaries compiled and developed by Nicolas: > https://sites.google.com/site/nicolashernandez/resources/opennlp > http://enicolashernandez.blogspot.fr/2012/12/apache-opennlp-fr-models.html > https://drive.google.com/drive/folders/0B4AyWQriFkxgWHR6QzlvcmxmdE0 > - > The problem happens only with the POS function. This is how I call the > function and respective issue: > Maxent_POS_Tag_Annotator(language = "fr", probs = TRUE, model = > paste0(, "fr-pos.bin")) > Issue: > Error in environment(f)$meta[[tag]] : subscript out of bounds > - > However, if I delete the language parameter, the issue does not happen > anymore: > Maxent_POS_Tag_Annotator(probs = TRUE, model = > paste0(, "fr-pos.bin")) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (OPENNLP-1104) Fix images at the bottom of the Powered By page, and use lower cases for link
[ https://issues.apache.org/jira/browse/OPENNLP-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-1104: Attachment: after.png before.png > Fix images at the bottom of the Powered By page, and use lower cases for link > - > > Key: OPENNLP-1104 > URL: https://issues.apache.org/jira/browse/OPENNLP-1104 > Project: OpenNLP > Issue Type: Documentation > Components: Website >Reporter: Bruno P. Kinoshita >Assignee: Bruno P. Kinoshita >Priority: Minor > Labels: website > Attachments: after.png, before.png > > > The current Powered By page is the only page with upper case letters. Besides > keeping things concise, there are cases where using lower case URLs may be > helpful for SEO (though that's not so relevant for our project I think). > The images at the bottom also are not being displayed. I didn't know, but > it looks like in AsciiDoc you *must* include the []'s, even if empty. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (OPENNLP-1104) Fix images at the bottom of the Powered By page, and use lower cases for link
Bruno P. Kinoshita created OPENNLP-1104: --- Summary: Fix images at the bottom of the Powered By page, and use lower cases for link Key: OPENNLP-1104 URL: https://issues.apache.org/jira/browse/OPENNLP-1104 Project: OpenNLP Issue Type: Documentation Components: Website Reporter: Bruno P. Kinoshita Assignee: Bruno P. Kinoshita Priority: Minor The current Powered By page is the only page with upper case letters. Besides keeping things concise, there are cases where using lower case URLs may be helpful for SEO (though that's not so relevant for our project I think). The images at the bottom also are not being displayed. I didn't know, but it looks like in AsciiDoc you *must* include the []'s, even if empty. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (OPENNLP-1103) Add AirNZ use case for OpenNLP to the web site
Bruno P. Kinoshita created OPENNLP-1103: --- Summary: Add AirNZ use case for OpenNLP to the web site Key: OPENNLP-1103 URL: https://issues.apache.org/jira/browse/OPENNLP-1103 Project: OpenNLP Issue Type: Documentation Components: Website Reporter: Bruno P. Kinoshita Assignee: Bruno P. Kinoshita Priority: Minor Went to the Wynyard Quarter innovation week some weeks ago, and saw that AirNZ was showing their bot and that it used OpenNLP. Spoke with Joey Faust, Product Manager, and got the following testimonial for our site. {noformat} Air New Zealand uses OpenNLP to power its chatbot, Oscar. Launched in February 2017, Oscar provides a conversational interface for customers to ask questions about flights, amenities and policies. Using OpenNLP, we've been able to consistently provide over 50% conversational success and support hundreds of intents. {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (OPENNLP-1098) Create a web page for 'Books-Tutorials-Talks'
[ https://issues.apache.org/jira/browse/OPENNLP-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16062655#comment-16062655 ] Bruno P. Kinoshita commented on OPENNLP-1098: - If you would like to throw some initial links here, and perhaps some categories too? Books, Twitter accounts, Projects, Talks, etc? > Create a web page for 'Books-Tutorials-Talks' > -- > > Key: OPENNLP-1098 > URL: https://issues.apache.org/jira/browse/OPENNLP-1098 > Project: OpenNLP > Issue Type: New Feature > Components: Website >Reporter: Suneel Marthi >Assignee: Suneel Marthi > Fix For: 1.8.1 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (OPENNLP-1098) Create a web page for 'Books-Tutorials-Talks'
[ https://issues.apache.org/jira/browse/OPENNLP-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16062653#comment-16062653 ] Bruno P. Kinoshita commented on OPENNLP-1098: - Went to JIRA to file this issue :-) thanks for doing that. Page linked in the chat for reference later: http://mahout.apache.org/general/books-tutorials-and-talks.html > Create a web page for 'Books-Tutorials-Talks' > -- > > Key: OPENNLP-1098 > URL: https://issues.apache.org/jira/browse/OPENNLP-1098 > Project: OpenNLP > Issue Type: New Feature > Components: Website >Reporter: Suneel Marthi >Assignee: Suneel Marthi > Fix For: 1.8.1 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (OPENNLP-1093) Update Maven JBake plug-in version and groupId
Bruno P. Kinoshita created OPENNLP-1093: --- Summary: Update Maven JBake plug-in version and groupId Key: OPENNLP-1093 URL: https://issues.apache.org/jira/browse/OPENNLP-1093 Project: OpenNLP Issue Type: Bug Components: Website Reporter: Bruno P. Kinoshita Assignee: Bruno P. Kinoshita Priority: Minor We found an issue that was fixed in master, but not released. When we asked about it, we learned that the plugin was being maintained elsewhere. We raised a question about how we could help getting the code to Maven central repository. It was published hours ago, so now we can start testing it. https://github.com/ingenieux/jbake-maven-plugin/issues/18 https://github.com/jbake-org/jbake-maven-plugin/issues/4 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (OPENNLP-1091) Fixing issues found via FindBugs and warnings found via IDE
Bruno P. Kinoshita created OPENNLP-1091: --- Summary: Fixing issues found via FindBugs and warnings found via IDE Key: OPENNLP-1091 URL: https://issues.apache.org/jira/browse/OPENNLP-1091 Project: OpenNLP Issue Type: Improvement Affects Versions: 1.8.0 Reporter: Bruno P. Kinoshita Assignee: Bruno P. Kinoshita Priority: Minor There are several issues that can be found using *FindBugs*. {noformat} mvn clean install findbugs:findbugs findbugs:gui {noformat} The _opennlp-tools_ is the only project with issues. Some are merely cosmetic, or not so important. The pull request mentioned in this issue does not fix all issues found, only the ones that I thought would be more important, and that would not have a huge impact on the code (i.e. would not have to change much of the current behaviour/code base). Some changes are quite useful, such as optimizations that replace string concatenation and use _Map#entrySet_ instead of _Map#keySet_ + another call to _Map#get_. With all the optimization changes put together, I expect we should see at least a few milliseconds improvement. Other changes are quite important, such as comparisons with _Object.equals(anArray, anotherArray)_, which will compare two objects with _==_, meaning that even when the arrays are equal, it would still return false. In the pull request, I intentionally did not squash it now, as the second commit includes warnings found via the IDE (Eclipse in this case, but I believe it's independent of the IDE), such as _suppressWarnings_ that are not necessary and, most importantly, resource leaks. This latter issue was fixed with Java 8 try-with-resources, mainly in tests, but also in some tools. Cheers Bruno -- This message was sent by Atlassian JIRA (v6.3.15#6346)
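Three of the fix categories named in OPENNLP-1091 can be shown side by side. This is a minimal sketch using only the JDK; the class and method names are hypothetical and none of the code is taken from opennlp-tools or from the pull request.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

public class FindBugsFixesSketch {

    // Arrays.equals compares element by element; calling equals on an array
    // compares references only, like ==.
    static boolean sameContent(int[] a, int[] b) {
        return Arrays.equals(a, b);
    }

    // Iterating entrySet reads key and value in one step, avoiding the extra
    // Map#get lookup needed when iterating keySet.
    static int sumValues(Map<String, Integer> counts) {
        int total = 0;
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            total += entry.getValue();
        }
        return total;
    }

    // try-with-resources closes the reader even if readLine throws.
    static String firstLine(String text) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] first = {1, 2, 3};
        int[] second = {1, 2, 3};
        System.out.println(first.equals(second));       // reference comparison
        System.out.println(sameContent(first, second)); // content comparison

        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("noun", 2);
        counts.put("verb", 3);
        System.out.println(sumValues(counts));

        System.out.println(firstLine("line one\nline two"));
    }
}
```

The array comparison is the one that changes behaviour rather than performance, which matches the issue's remark that those fixes are the important ones.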
[jira] [Resolved] (OPENNLP-1067) Use a variable to replace OpenNLP version in pages
[ https://issues.apache.org/jira/browse/OPENNLP-1067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita resolved OPENNLP-1067. - Resolution: Fixed > Use a variable to replace OpenNLP version in pages > -- > > Key: OPENNLP-1067 > URL: https://issues.apache.org/jira/browse/OPENNLP-1067 > Project: OpenNLP > Issue Type: Improvement > Components: Website >Reporter: Bruno P. Kinoshita >Assignee: Bruno P. Kinoshita > > Currently, we have to update several pages after a release, in order to > update the site. > This could be automated with a Maven variable + some JBake-fu. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (OPENNLP-1067) Use a variable to replace OpenNLP version in pages
[ https://issues.apache.org/jira/browse/OPENNLP-1067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita reassigned OPENNLP-1067: --- Assignee: Bruno P. Kinoshita > Use a variable to replace OpenNLP version in pages > -- > > Key: OPENNLP-1067 > URL: https://issues.apache.org/jira/browse/OPENNLP-1067 > Project: OpenNLP > Issue Type: Improvement > Components: Website >Reporter: Bruno P. Kinoshita >Assignee: Bruno P. Kinoshita > > Currently, we have to update several pages after a release, in order to > update the site. > This could be automated with a Maven variable + some JBake-fu. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (OPENNLP-1045) Add documentation for development with Git (at ASF, GitHub, etc) for OpenNLP
[ https://issues.apache.org/jira/browse/OPENNLP-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013666#comment-16013666 ] Bruno P. Kinoshita commented on OPENNLP-1045: - Look at adding or possibly copying many parts of: https://mahout.apache.org/developers/github.html > Add documentation for development with Git (at ASF, GitHub, etc) for OpenNLP > > > Key: OPENNLP-1045 > URL: https://issues.apache.org/jira/browse/OPENNLP-1045 > Project: OpenNLP > Issue Type: Documentation > Components: Website >Reporter: Bruno P. Kinoshita >Assignee: Bruno P. Kinoshita >Priority: Minor > Labels: development, documentation, git, website > Attachments: OPENNLP-1045-menu-20170516.png, > OPENNLP-1045-page-20170516-fullpage.png > > > We need to add documentation for developers, explaining the process to work > with Git in Apache OpenNLP. > Listing things like proper way to commit (e.g. include JIRA issue whenever > possible in the commit message), how to handle and merge pull requests (e.g. > empty commits, merge with fast-forward, etc), and so it goes. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (OPENNLP-1045) Add documentation for development with Git (at ASF, GitHub, etc) for OpenNLP
[ https://issues.apache.org/jira/browse/OPENNLP-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-1045: Attachment: OPENNLP-1045-page-20170516-fullpage.png OPENNLP-1045-menu-20170516.png Attached a screenshot of the drop-down menu, showing where the new entry will be and what it will look like. Also attached the current draft. Will submit a pull request soon; I just want to proofread it in the morning after drinking some strong coffee (-: > Add documentation for development with Git (at ASF, GitHub, etc) for OpenNLP > > > Key: OPENNLP-1045 > URL: https://issues.apache.org/jira/browse/OPENNLP-1045 > Project: OpenNLP > Issue Type: Documentation > Components: Website >Reporter: Bruno P. Kinoshita >Assignee: Bruno P. Kinoshita >Priority: Minor > Labels: development, documentation, git, website > Attachments: OPENNLP-1045-menu-20170516.png, > OPENNLP-1045-page-20170516-fullpage.png > > > We need to add documentation for developers, explaining the process to work > with Git in Apache OpenNLP. > Listing things like proper way to commit (e.g. include JIRA issue whenever > possible in the commit message), how to handle and merge pull requests (e.g. > empty commits, merge with fast-forward, etc), and so it goes. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (OPENNLP-1045) Add documentation for development with Git (at ASF, GitHub, etc) for OpenNLP
[ https://issues.apache.org/jira/browse/OPENNLP-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita reassigned OPENNLP-1045: --- Assignee: Bruno P. Kinoshita > Add documentation for development with Git (at ASF, GitHub, etc) for OpenNLP > > > Key: OPENNLP-1045 > URL: https://issues.apache.org/jira/browse/OPENNLP-1045 > Project: OpenNLP > Issue Type: Documentation > Components: Website >Reporter: Bruno P. Kinoshita >Assignee: Bruno P. Kinoshita >Priority: Minor > Labels: development, documentation, git, website > > We need to add documentation for developers, explaining the process to work > with Git in Apache OpenNLP. > Listing things like proper way to commit (e.g. include JIRA issue whenever > possible in the commit message), how to handle and merge pull requests (e.g. > empty commits, merge with fast-forward, etc), and so it goes. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (OPENNLP-1053) DOAP has moved
[ https://issues.apache.org/jira/browse/OPENNLP-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita resolved OPENNLP-1053. - Resolution: Fixed Fixed via pull request #6, and site deployment. DOAP file online now. > DOAP has moved > -- > > Key: OPENNLP-1053 > URL: https://issues.apache.org/jira/browse/OPENNLP-1053 > Project: OpenNLP > Issue Type: Bug >Reporter: Sebb >Assignee: Jeff Zemerick > > The DOAP used to be located at: > http://opennlp.apache.org/doap_opennlp.rdf > It has disappeared. > Please either replace it, or update the link here: > https://svn.apache.org/repos/asf/comdev/projects.apache.org/data/projects.xml -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (OPENNLP-1045) Add documentation for development with Git (at ASF, GitHub, etc) for OpenNLP
[ https://issues.apache.org/jira/browse/OPENNLP-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16006523#comment-16006523 ] Bruno P. Kinoshita commented on OPENNLP-1045: - Lesson from tonight's mess in opennlp-site :-)
{noformat}
Add a section for merging pull requests. Developers should be instructed to either enable fast-forward-only merges (globally or for the local repository):

git config --global merge.ff only
git config merge.ff only

or remember to use --ff-only when merging pull requests.

Also, before submitting, developers can amend the commit message, adding the footnote "This closes #1234", where 1234 is the pull request number.

If the developer forgets to amend the commit and is the author of the pull request, they can still close it in the GitHub UI. Alternatively, the developer should ask the user to close the pull request, and wait a while. The last option is to send an empty commit.
{noformat}
> Add documentation for development with Git (at ASF, GitHub, etc) for OpenNLP > > > Key: OPENNLP-1045 > URL: https://issues.apache.org/jira/browse/OPENNLP-1045 > Project: OpenNLP > Issue Type: Documentation > Components: Website >Reporter: Bruno P. Kinoshita >Priority: Minor > Labels: development, documentation, git, website > > We need to add documentation for developers, explaining the process to work > with Git in Apache OpenNLP. > Listing things like proper way to commit (e.g. include JIRA issue whenever > possible in the commit message), how to handle and merge pull requests (e.g. > empty commits, merge with fast-forward, etc), and so it goes. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
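The merge steps described in the comment above can be sketched as a small helper script. This is only an illustration, not the project's documented procedure: the function names (`closing_footer`, `merge_command`, `amend_command`, `empty_commit_command`) and the branch name are hypothetical, and the script only wraps the standard `git merge --ff-only`, `git commit --amend` and `git commit --allow-empty` commands.

```python
#!/usr/bin/env python3
"""Sketch of the pull-request merge steps; all helper names are hypothetical."""
import subprocess
from typing import List


def closing_footer(message: str, pr_number: int) -> str:
    """Append the 'This closes #N' footer unless the message already has it."""
    footer = "This closes #%d" % pr_number
    if footer in message.splitlines():
        return message
    return message.rstrip("\n") + "\n\n" + footer + "\n"


def merge_command(branch: str) -> List[str]:
    """Merge a pull-request branch, refusing anything but a fast-forward."""
    return ["git", "merge", "--ff-only", branch]


def amend_command(message: str, pr_number: int) -> List[str]:
    """Rewrite the last commit message so that it carries the closing footer."""
    return ["git", "commit", "--amend", "-m", closing_footer(message, pr_number)]


def empty_commit_command(pr_number: int) -> List[str]:
    """Last resort: an empty commit whose only purpose is to close the PR."""
    return ["git", "commit", "--allow-empty", "-m", "This closes #%d" % pr_number]


def run(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print the commands, or execute them when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # 'pr-branch' is a placeholder; 1234 is the example number from the comment.
    run([merge_command("pr-branch"),
         amend_command("OPENNLP-1045: Add Git documentation", 1234)])
```

By default the script only prints the commands, so they can be checked before anything is changed in the repository.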
[jira] [Resolved] (OPENNLP-393) Add a contributions wanted page to our website
[ https://issues.apache.org/jira/browse/OPENNLP-393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita resolved OPENNLP-393. Resolution: Fixed > Add a contributions wanted page to our website > -- > > Key: OPENNLP-393 > URL: https://issues.apache.org/jira/browse/OPENNLP-393 > Project: OpenNLP > Issue Type: Task > Components: Website >Reporter: Joern Kottmann >Assignee: Bruno P. Kinoshita > Attachments: contributions-wanted-preview-1-fullpage.png > > > OpenNLP would like to get more contributions from the community to encourage > people to contribute more we should add a page which lists things which > should be done in OpenNLP. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (OPENNLP-504) Add a FAQ page to our site
[ https://issues.apache.org/jira/browse/OPENNLP-504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-504: --- Attachment: opennlp-faq-wip-20170512-fullpage.png > Add a FAQ page to our site > -- > > Key: OPENNLP-504 > URL: https://issues.apache.org/jira/browse/OPENNLP-504 > Project: OpenNLP > Issue Type: Improvement > Components: Website >Reporter: James Kosin >Assignee: Bruno P. Kinoshita >Priority: Minor > Labels: FAQ, newbie > Attachments: opennlp-faq-wip-20170512-fullpage.png > > > Collect and assemble a FAQ page for our site. > Most questions start out: > Where can I get the models? > Where do I start getting to know OpenNLP? > etc. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (OPENNLP-504) Add a FAQ page to our site
[ https://issues.apache.org/jira/browse/OPENNLP-504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita reassigned OPENNLP-504: -- Assignee: Bruno P. Kinoshita > Add a FAQ page to our site > -- > > Key: OPENNLP-504 > URL: https://issues.apache.org/jira/browse/OPENNLP-504 > Project: OpenNLP > Issue Type: Improvement > Components: Website >Reporter: James Kosin >Assignee: Bruno P. Kinoshita >Priority: Minor > Labels: FAQ, newbie > > Collect and assemble a FAQ page for our site. > Most questions start out: > Where can I get the models? > Where do I start getting to know OpenNLP? > etc. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (OPENNLP-393) Add a contributions wanted page to our website
[ https://issues.apache.org/jira/browse/OPENNLP-393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16004408#comment-16004408 ] Bruno P. Kinoshita edited comment on OPENNLP-393 at 5/10/17 9:43 AM: - For the list of issues, I simply searched all issues marked with "help-wanted". Here's how to create a list and simply copy-paste to update that page. Will try to find a space in the Wiki for that.
{code}
#!/usr/bin/env python3
from jira import JIRA

JIRA_URL='https://issues.apache.org/jira/'

# Anonymous connection to the ASF JIRA
jira = JIRA(JIRA_URL)
# All unresolved OpenNLP issues labelled help-wanted
help_wanted_issues = jira.search_issues('project=OPENNLP AND resolution IS EMPTY and labels = help-wanted', maxResults=100)
for issue in help_wanted_issues:
    issue_url = "%sbrowse/%s" % (JIRA_URL, issue.key)
    # One list entry per issue: link, key and summary
    print("* %s[%s]: %s" % (issue_url, issue.key, issue.fields.summary))
{code}
PS: besides having Python 3, you need to run `pip install jira` to get the jira module from here: https://jira.readthedocs.io ... One could also sort by createdDate, watchers, etc.
was (Author: kinow): For the list of issues, I simply searched all issues marked with "help-wanted". Here's how to create a list and simply copy-paste to update that page. Will try to find a space in the Wiki for that.
{code}
#!/usr/bin/env python3
from jira import JIRA

JIRA_URL='https://issues.apache.org/jira/'

# Anonymous connection to the ASF JIRA
jira = JIRA(JIRA_URL)
# All unresolved OpenNLP issues labelled help-wanted
help_wanted_issues = jira.search_issues('project=OPENNLP AND resolution IS EMPTY and labels = help-wanted', maxResults=100)
for issue in help_wanted_issues:
    issue_url = "%sbrowse/%s" % (JIRA_URL, issue.key)
    # One list entry per issue: link, key and summary
    print("* %s[%s]: %s" % (issue_url, issue.key, issue.fields.summary))
{code}
> Add a contributions wanted page to our website > -- > > Key: OPENNLP-393 > URL: https://issues.apache.org/jira/browse/OPENNLP-393 > Project: OpenNLP > Issue Type: Task > Components: Website >Reporter: Joern Kottmann >Assignee: Bruno P. Kinoshita > Attachments: contributions-wanted-preview-1-fullpage.png > > > OpenNLP would like to get more contributions from the community to encourage > people to contribute more we should add a page which lists things which > should be done in OpenNLP. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
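The edited comment above notes that the list could also be sorted, for example by creation date. A minimal sketch of that variation follows, assuming the same `jira` module and the same anonymous access to the ASF JIRA. The helper names (`build_jql`, `format_line`, `fetch_lines`) are hypothetical; the only change to the query is a standard JQL `ORDER BY` clause.

```python
#!/usr/bin/env python3
"""Sketch: help-wanted list sorted by creation date (helper names are hypothetical)."""

JIRA_URL = "https://issues.apache.org/jira/"


def build_jql(project="OPENNLP", label="help-wanted",
              order_by="created", descending=True):
    """Build the search query, with an ORDER BY clause appended."""
    direction = "DESC" if descending else "ASC"
    return ("project = %s AND resolution IS EMPTY AND labels = %s "
            "ORDER BY %s %s" % (project, label, order_by, direction))


def format_line(key, summary, base_url=JIRA_URL):
    """Format one issue in the same 'url[key]: summary' style as the script above."""
    return "* %sbrowse/%s[%s]: %s" % (base_url, key, key, summary)


def fetch_lines(max_results=100):
    """Query JIRA and return the formatted lines (needs `pip install jira`)."""
    from jira import JIRA  # third-party module, imported only when querying
    client = JIRA(JIRA_URL)
    issues = client.search_issues(build_jql(), maxResults=max_results)
    return [format_line(issue.key, issue.fields.summary) for issue in issues]


if __name__ == "__main__":
    for line in fetch_lines():
        print(line)
```

Keeping the query and the formatting in separate functions means the output can be checked without a network connection; only `fetch_lines` talks to JIRA.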
[jira] [Commented] (OPENNLP-393) Add a contributions wanted page to our website
[ https://issues.apache.org/jira/browse/OPENNLP-393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16004408#comment-16004408 ] Bruno P. Kinoshita commented on OPENNLP-393: For the list of issues, I simply searched all issues marked with "help-wanted". Here's how to create a list and simply copy-paste to update that page. Will try to find a space in the Wiki for that.
{code}
#!/usr/bin/env python3
from jira import JIRA

JIRA_URL='https://issues.apache.org/jira/'

# Anonymous connection to the ASF JIRA
jira = JIRA(JIRA_URL)
# All unresolved OpenNLP issues labelled help-wanted
help_wanted_issues = jira.search_issues('project=OPENNLP AND resolution IS EMPTY and labels = help-wanted', maxResults=100)
for issue in help_wanted_issues:
    issue_url = "%sbrowse/%s" % (JIRA_URL, issue.key)
    # One list entry per issue: link, key and summary
    print("* %s[%s]: %s" % (issue_url, issue.key, issue.fields.summary))
{code}
> Add a contributions wanted page to our website > -- > > Key: OPENNLP-393 > URL: https://issues.apache.org/jira/browse/OPENNLP-393 > Project: OpenNLP > Issue Type: Task > Components: Website >Reporter: Joern Kottmann >Assignee: Bruno P. Kinoshita > Attachments: contributions-wanted-preview-1-fullpage.png > > > OpenNLP would like to get more contributions from the community to encourage > people to contribute more we should add a page which lists things which > should be done in OpenNLP. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (OPENNLP-393) Add a contributions wanted page to our website
[ https://issues.apache.org/jira/browse/OPENNLP-393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-393: --- Attachment: contributions-wanted-preview-1-fullpage.png First preview of what it may look like. Preparing pull request in a few minutes. > Add a contributions wanted page to our website > -- > > Key: OPENNLP-393 > URL: https://issues.apache.org/jira/browse/OPENNLP-393 > Project: OpenNLP > Issue Type: Task > Components: Website >Reporter: Joern Kottmann >Assignee: Bruno P. Kinoshita > Attachments: contributions-wanted-preview-1-fullpage.png > > > OpenNLP would like to get more contributions from the community to encourage > people to contribute more we should add a page which lists things which > should be done in OpenNLP. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (OPENNLP-393) Add a contributions wanted page to our website
[ https://issues.apache.org/jira/browse/OPENNLP-393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita reassigned OPENNLP-393: -- Assignee: Bruno P. Kinoshita (was: Joern Kottmann) > Add a contributions wanted page to our website > -- > > Key: OPENNLP-393 > URL: https://issues.apache.org/jira/browse/OPENNLP-393 > Project: OpenNLP > Issue Type: Task > Components: Website >Reporter: Joern Kottmann >Assignee: Bruno P. Kinoshita > Attachments: contributions-wanted-preview-1-fullpage.png > > > OpenNLP would like to get more contributions from the community to encourage > people to contribute more we should add a page which lists things which > should be done in OpenNLP. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (OPENNLP-393) Add a contributions wanted page to our website
[ https://issues.apache.org/jira/browse/OPENNLP-393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981033#comment-15981033 ] Bruno P. Kinoshita commented on OPENNLP-393: After discussion on Slack, we agreed that having the issues listed somewhere on the website would be more helpful to users. Listing issues dynamically from JIRA could potentially be slow (JIRA can be slow due to high load, or offline for maintenance). So the path now seems to be to get a page/space in Getting Involved, and manually list a few issues. Once OPENNLP-999 is merged, we can update the page. > Add a contributions wanted page to our website > -- > > Key: OPENNLP-393 > URL: https://issues.apache.org/jira/browse/OPENNLP-393 > Project: OpenNLP > Issue Type: Task > Components: Website >Reporter: Joern Kottmann >Assignee: Joern Kottmann > > OpenNLP would like to get more contributions from the community to encourage > people to contribute more we should add a page which lists things which > should be done in OpenNLP. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (OPENNLP-1045) Add documentation for development with Git (at ASF, GitHub, etc) for OpenNLP
[ https://issues.apache.org/jira/browse/OPENNLP-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981014#comment-15981014 ] Bruno P. Kinoshita commented on OPENNLP-1045: - Example documentation page/wiki from other projects: * [Apache Zookeeper: Merging Github Pull Requests|https://cwiki.apache.org/confluence/display/ZOOKEEPER/Merging+Github+Pull+Requests] * [Apache Stratos: Merging Pull Requests|https://cwiki.apache.org/confluence/display/STRATOS/Merging+Pull+Requests] * [Apache Commons: Using GIT|https://wiki.apache.org/commons/UsingGIT] * [Apache Cordova: Processing Pull requests|https://github.com/apache/cordova-coho/blob/master/docs/processing-pull-requests.md] > Add documentation for development with Git (at ASF, GitHub, etc) for OpenNLP > > > Key: OPENNLP-1045 > URL: https://issues.apache.org/jira/browse/OPENNLP-1045 > Project: OpenNLP > Issue Type: Documentation > Components: Website >Reporter: Bruno P. Kinoshita >Priority: Minor > Labels: development, documentation, git, website > > We need to add documentation for developers, explaining the process to work > with Git in Apache OpenNLP. > Listing things like proper way to commit (e.g. include JIRA issue whenever > possible in the commit message), how to handle and merge pull requests (e.g. > empty commits, merge with fast-forward, etc), and so it goes. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (OPENNLP-1045) Add documentation for development with Git (at ASF, GitHub, etc) for OpenNLP
Bruno P. Kinoshita created OPENNLP-1045: --- Summary: Add documentation for development with Git (at ASF, GitHub, etc) for OpenNLP Key: OPENNLP-1045 URL: https://issues.apache.org/jira/browse/OPENNLP-1045 Project: OpenNLP Issue Type: Documentation Components: Website Reporter: Bruno P. Kinoshita Priority: Minor We need to add documentation for developers, explaining the process to work with Git in Apache OpenNLP. Listing things like proper way to commit (e.g. include JIRA issue whenever possible in the commit message), how to handle and merge pull requests (e.g. empty commits, merge with fast-forward, etc), and so it goes. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (OPENNLP-393) Add a contributions wanted page to our website
[ https://issues.apache.org/jira/browse/OPENNLP-393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981012#comment-15981012 ] Bruno P. Kinoshita commented on OPENNLP-393: We could, perhaps: * Use the Help Wanted page from ASF - https://helpwanted.apache.org/ * Ask the maintainer of that app/site at ASF to add a way to link filtering by project (e.g. https://helpwanted.apache.org/?project=opennlp) * Add a link to the Help Wanted tasks for OpenNLP That way we would both show the contributions wanted for OpenNLP, and at the same time show users that we have a centralised repository for help-wanted issues. > Add a contributions wanted page to our website > -- > > Key: OPENNLP-393 > URL: https://issues.apache.org/jira/browse/OPENNLP-393 > Project: OpenNLP > Issue Type: Task > Components: Website >Reporter: Joern Kottmann >Assignee: Joern Kottmann > > OpenNLP would like to get more contributions from the community to encourage > people to contribute more we should add a page which lists things which > should be done in OpenNLP. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (OPENNLP-6) Create a new OpenNLP project logo for the site
[ https://issues.apache.org/jira/browse/OPENNLP-6?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-6: - Attachment: opennlp-logo_20170421.tar.gz Updated logo to include TM trademark. Also added some helper files for the favicons. > Create a new OpenNLP project logo for the site > -- > > Key: OPENNLP-6 > URL: https://issues.apache.org/jira/browse/OPENNLP-6 > Project: OpenNLP > Issue Type: Improvement > Components: Website >Reporter: Joern Kottmann >Assignee: Joern Kottmann >Priority: Minor > Labels: help-wanted > Fix For: 1.8.0 > > Attachments: kinow-opennlp-1.png, kinow-opennlp-2.png, > kinow-opennlp-3.png, kinow-opennlp-3-variations.png, koji-kinoshita-1.png, > koji-kinoshita-tomasso-02.png, koji-kinoshita-tomasso.svg, > koji-kinoshita-tomasso-updated.png, koji-kinoshita-tomasso-updated.svg, > koji-kinoshita-tomasso-variations.png, koji-kinoshita-tomasso-variations.svg, > koji-kinoshita-tommaso.png, koji-kinoshita-tommaso.png, opennlp_20170412.jpg, > OpenNLP-koji-1.png, OpenNLP-koji-1.png, OpenNLP-koji-2.png, > opennlp-logo_20170412.tar.gz, opennlp-logo_20170421.tar.gz, OpenNLP.png, > opennlp-variations.png, page-example.png --fullpage.png, text4363.png > > > The current logo was not changed for a long time (if ever). > Lets create a new fresh looking logo. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (OPENNLP-6) Create a new OpenNLP project logo for the site
[ https://issues.apache.org/jira/browse/OPENNLP-6?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-6: - Attachment: opennlp_20170412.jpg opennlp-logo_20170412.tar.gz > Create a new OpenNLP project logo for the site > -- > > Key: OPENNLP-6 > URL: https://issues.apache.org/jira/browse/OPENNLP-6 > Project: OpenNLP > Issue Type: Improvement > Components: Website >Reporter: Joern Kottmann >Assignee: Joern Kottmann >Priority: Minor > Labels: help-wanted > Fix For: 1.8.0 > > Attachments: kinow-opennlp-1.png, kinow-opennlp-2.png, > kinow-opennlp-3.png, kinow-opennlp-3-variations.png, koji-kinoshita-1.png, > koji-kinoshita-tomasso-02.png, koji-kinoshita-tomasso.svg, > koji-kinoshita-tomasso-updated.png, koji-kinoshita-tomasso-updated.svg, > koji-kinoshita-tomasso-variations.png, koji-kinoshita-tomasso-variations.svg, > koji-kinoshita-tommaso.png, koji-kinoshita-tommaso.png, opennlp_20170412.jpg, > OpenNLP-koji-1.png, OpenNLP-koji-1.png, OpenNLP-koji-2.png, > opennlp-logo_20170412.tar.gz, OpenNLP.png, opennlp-variations.png, > page-example.png --fullpage.png, text4363.png > > > The current logo was not changed for a long time (if ever). > Lets create a new fresh looking logo. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (OPENNLP-6) Create a new OpenNLP project logo for the site
[ https://issues.apache.org/jira/browse/OPENNLP-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15965570#comment-15965570 ] Bruno P. Kinoshita commented on OPENNLP-6: -- Uploading file opennlp-logo_20170412.tar.gz, and also opennlp_20170412.jpg. The archive contains all SVG source files, as well as an exported PNG. It includes several size and colour variations for use as logo, icon, favicon, etc. Based on the third option in the koji-kinoshita-tomasso-02.png file. > Create a new OpenNLP project logo for the site > -- > > Key: OPENNLP-6 > URL: https://issues.apache.org/jira/browse/OPENNLP-6 > Project: OpenNLP > Issue Type: Improvement > Components: Website >Reporter: Joern Kottmann >Assignee: Joern Kottmann >Priority: Minor > Labels: help-wanted > Fix For: 1.8.0 > > Attachments: kinow-opennlp-1.png, kinow-opennlp-2.png, > kinow-opennlp-3.png, kinow-opennlp-3-variations.png, koji-kinoshita-1.png, > koji-kinoshita-tomasso-02.png, koji-kinoshita-tomasso.svg, > koji-kinoshita-tomasso-updated.png, koji-kinoshita-tomasso-updated.svg, > koji-kinoshita-tomasso-variations.png, koji-kinoshita-tomasso-variations.svg, > koji-kinoshita-tommaso.png, koji-kinoshita-tommaso.png, OpenNLP-koji-1.png, > OpenNLP-koji-1.png, OpenNLP-koji-2.png, OpenNLP.png, opennlp-variations.png, > page-example.png --fullpage.png, text4363.png > > > The current logo was not changed for a long time (if ever). > Lets create a new fresh looking logo. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (OPENNLP-6) Create a new OpenNLP project logo for the site
[ https://issues.apache.org/jira/browse/OPENNLP-6?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-6: - Attachment: koji-kinoshita-tomasso-02.png Uploading a few variations, with * No gradient * Different colours used for the book and for the letters * Some font variations * Inverted book, so that the writer is right handed (for some reason it looked more natural that way for me) We can call these examples 1 - 6. So if you would like to see a combination of something from two or three variations, just name them that way :-) (i.e. get the colours from number #1, and the font from number #4, and so it goes) > Create a new OpenNLP project logo for the site > -- > > Key: OPENNLP-6 > URL: https://issues.apache.org/jira/browse/OPENNLP-6 > Project: OpenNLP > Issue Type: Improvement > Components: Website >Reporter: Joern Kottmann >Assignee: Joern Kottmann >Priority: Minor > Labels: help-wanted > Fix For: 1.8.0 > > Attachments: kinow-opennlp-1.png, kinow-opennlp-2.png, > kinow-opennlp-3.png, kinow-opennlp-3-variations.png, koji-kinoshita-1.png, > koji-kinoshita-tomasso-02.png, koji-kinoshita-tomasso.svg, > koji-kinoshita-tomasso-updated.png, koji-kinoshita-tomasso-updated.svg, > koji-kinoshita-tomasso-variations.png, koji-kinoshita-tomasso-variations.svg, > koji-kinoshita-tommaso.png, koji-kinoshita-tommaso.png, OpenNLP-koji-1.png, > OpenNLP-koji-1.png, OpenNLP-koji-2.png, OpenNLP.png, opennlp-variations.png, > page-example.png --fullpage.png, text4363.png > > > The current logo was not changed for a long time (if ever). > Lets create a new fresh looking logo. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (OPENNLP-999) RFE update web site layout
Bruno P. Kinoshita created OPENNLP-999: -- Summary: RFE update web site layout Key: OPENNLP-999 URL: https://issues.apache.org/jira/browse/OPENNLP-999 Project: OpenNLP Issue Type: Bug Components: Website Reporter: Bruno P. Kinoshita Priority: Minor Started a thread on the dev mailing list recently, and it didn't get any negative feedback, so filing this issue as a placeholder for discussion. The OpenNLP web site has an old layout, while most new ASF projects are migrating to Jekyll, Maven Site + Fluid Skin, or other new solutions. http://mail-archives.apache.org/mod_mbox/opennlp-dev/201703.mbox/browser There is a current POC available at https://kinow.github.io/opennlp/, using Apache Maven Site plug-in and Fluid Skin. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (OPENNLP-6) Create a new OpenNLP project logo for the site
[ https://issues.apache.org/jira/browse/OPENNLP-6?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-6: - Attachment: text4363.png > Create a new OpenNLP project logo for the site > -- > > Key: OPENNLP-6 > URL: https://issues.apache.org/jira/browse/OPENNLP-6 > Project: OpenNLP > Issue Type: Improvement > Components: Website >Reporter: Joern Kottmann >Assignee: Joern Kottmann >Priority: Minor > Labels: help-wanted > Fix For: 1.8.0 > > Attachments: kinow-opennlp-1.png, kinow-opennlp-2.png, > kinow-opennlp-3.png, kinow-opennlp-3-variations.png, koji-kinoshita-1.png, > koji-kinoshita-tomasso.svg, koji-kinoshita-tomasso-updated.png, > koji-kinoshita-tomasso-updated.svg, koji-kinoshita-tomasso-variations.png, > koji-kinoshita-tomasso-variations.svg, koji-kinoshita-tommaso.png, > koji-kinoshita-tommaso.png, OpenNLP-koji-1.png, OpenNLP-koji-1.png, > OpenNLP-koji-2.png, OpenNLP.png, opennlp-variations.png, page-example.png > --fullpage.png, text4363.png > > > The current logo was not changed for a long time (if ever). > Lets create a new fresh looking logo. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (OPENNLP-6) Create a new OpenNLP project logo for the site
[ https://issues.apache.org/jira/browse/OPENNLP-6?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-6: - Attachment: page-example.png --fullpage.png Attaching a quick example of what the home page could look like with the new logo. > Create a new OpenNLP project logo for the site > -- > > Key: OPENNLP-6 > URL: https://issues.apache.org/jira/browse/OPENNLP-6 > Project: OpenNLP > Issue Type: Improvement > Components: Website >Reporter: Joern Kottmann >Priority: Minor > Labels: help-wanted > Fix For: 1.8.0 > > Attachments: kinow-opennlp-1.png, kinow-opennlp-2.png, > kinow-opennlp-3.png, kinow-opennlp-3-variations.png, koji-kinoshita-1.png, > koji-kinoshita-tomasso.svg, koji-kinoshita-tomasso-updated.png, > koji-kinoshita-tomasso-updated.svg, koji-kinoshita-tomasso-variations.png, > koji-kinoshita-tomasso-variations.svg, koji-kinoshita-tommaso.png, > koji-kinoshita-tommaso.png, OpenNLP-koji-1.png, OpenNLP-koji-1.png, > OpenNLP-koji-2.png, OpenNLP.png, opennlp-variations.png, page-example.png > --fullpage.png > > > The current logo was not changed for a long time (if ever). > Lets create a new fresh looking logo. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (OPENNLP-6) Create a new OpenNLP project logo for the site
[ https://issues.apache.org/jira/browse/OPENNLP-6?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-6: - Attachment: koji-kinoshita-tommaso.png koji-kinoshita-tomasso-variations.svg koji-kinoshita-tomasso-variations.png koji-kinoshita-tomasso-updated.svg koji-kinoshita-tomasso-updated.png koji-kinoshita-tomasso.svg > Create a new OpenNLP project logo for the site > -- > > Key: OPENNLP-6 > URL: https://issues.apache.org/jira/browse/OPENNLP-6 > Project: OpenNLP > Issue Type: Improvement > Components: Website >Reporter: Joern Kottmann >Priority: Minor > Labels: help-wanted > Fix For: 1.8.0 > > Attachments: kinow-opennlp-1.png, kinow-opennlp-2.png, > kinow-opennlp-3.png, kinow-opennlp-3-variations.png, koji-kinoshita-1.png, > koji-kinoshita-tomasso.svg, koji-kinoshita-tomasso-updated.png, > koji-kinoshita-tomasso-updated.svg, koji-kinoshita-tomasso-variations.png, > koji-kinoshita-tomasso-variations.svg, koji-kinoshita-tommaso.png, > koji-kinoshita-tommaso.png, OpenNLP-koji-1.png, OpenNLP-koji-1.png, > OpenNLP-koji-2.png, OpenNLP.png, opennlp-variations.png > > > The current logo was not changed for a long time (if ever). > Lets create a new fresh looking logo. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (OPENNLP-6) Create a new OpenNLP project logo for the site
[ https://issues.apache.org/jira/browse/OPENNLP-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873409#comment-15873409 ] Bruno P. Kinoshita commented on OPENNLP-6: -- Thanks [~teofili]. Spent some time today vectorizing the book. Now in the attached SVGs, everything is a vector, and can be reused, scaled, transformed, etc. Applied the gradient as similar to yours as possible. * koji-kinoshita-tomasso.svg - contains the vectorized content and the other material used as reference. Useful in case some modification is necessary. * koji-kinoshita-tomasso-updated.svg - contains the updated logo. * koji-kinoshita-tomasso-variations.svg - contains the updated logo and a few possible variations for icons. Attached PNG files of the same, in case someone's browser doesn't handle SVGs properly. > Create a new OpenNLP project logo for the site > -- > > Key: OPENNLP-6 > URL: https://issues.apache.org/jira/browse/OPENNLP-6 > Project: OpenNLP > Issue Type: Improvement > Components: Website >Reporter: Joern Kottmann >Priority: Minor > Labels: help-wanted > Fix For: 1.8.0 > > Attachments: kinow-opennlp-1.png, kinow-opennlp-2.png, > kinow-opennlp-3.png, kinow-opennlp-3-variations.png, koji-kinoshita-1.png, > koji-kinoshita-tommaso.png, OpenNLP-koji-1.png, OpenNLP-koji-1.png, > OpenNLP-koji-2.png, OpenNLP.png, opennlp-variations.png > > > The current logo was not changed for a long time (if ever). > Lets create a new fresh looking logo. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (OPENNLP-6) Create a new OpenNLP project logo for the site
[ https://issues.apache.org/jira/browse/OPENNLP-6?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-6: - Attachment: koji-kinoshita-1.png > Create a new OpenNLP project logo for the site > -- > > Key: OPENNLP-6 > URL: https://issues.apache.org/jira/browse/OPENNLP-6 > Project: OpenNLP > Issue Type: Improvement > Components: Website >Reporter: Joern Kottmann >Priority: Minor > Labels: help-wanted > Fix For: 1.8.0 > > Attachments: kinow-opennlp-1.png, kinow-opennlp-2.png, > kinow-opennlp-3.png, kinow-opennlp-3-variations.png, koji-kinoshita-1.png, > OpenNLP-koji-1.png, OpenNLP-koji-1.png, OpenNLP-koji-2.png, OpenNLP.png, > opennlp-variations.png > > > The current logo was not changed for a long time (if ever). > Lets create a new fresh looking logo. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (OPENNLP-6) Create a new OpenNLP project logo for the site
[ https://issues.apache.org/jira/browse/OPENNLP-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869483#comment-15869483 ] Bruno P. Kinoshita commented on OPENNLP-6: -- Just to check if I'm going in the right direction... something like the attached logos in koji-kinoshita-1.png? > Create a new OpenNLP project logo for the site > -- > > Key: OPENNLP-6 > URL: https://issues.apache.org/jira/browse/OPENNLP-6 > Project: OpenNLP > Issue Type: Improvement > Components: Website >Reporter: Joern Kottmann >Priority: Minor > Labels: help-wanted > Fix For: 1.8.0 > > Attachments: kinow-opennlp-1.png, kinow-opennlp-2.png, > kinow-opennlp-3.png, kinow-opennlp-3-variations.png, OpenNLP-koji-1.png, > OpenNLP-koji-1.png, OpenNLP-koji-2.png, OpenNLP.png, opennlp-variations.png > > > The current logo was not changed for a long time (if ever). > Lets create a new fresh looking logo. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (OPENNLP-6) Create a new OpenNLP project logo for the site
[ https://issues.apache.org/jira/browse/OPENNLP-6?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-6: - Attachment: opennlp-variations.png > Create a new OpenNLP project logo for the site > -- > > Key: OPENNLP-6 > URL: https://issues.apache.org/jira/browse/OPENNLP-6 > Project: OpenNLP > Issue Type: Improvement > Components: Website >Reporter: Joern Kottmann >Priority: Minor > Labels: help-wanted > Attachments: kinow-opennlp-1.png, kinow-opennlp-2.png, > kinow-opennlp-3.png, kinow-opennlp-3-variations.png, OpenNLP.png, > opennlp-variations.png > > > The current logo was not changed for a long time (if ever). > Lets create a new fresh looking logo. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OPENNLP-6) Create a new OpenNLP project logo for the site
[ https://issues.apache.org/jira/browse/OPENNLP-6?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-6: - Attachment: kinow-opennlp-3-variations.png > Create a new OpenNLP project logo for the site > -- > > Key: OPENNLP-6 > URL: https://issues.apache.org/jira/browse/OPENNLP-6 > Project: OpenNLP > Issue Type: Improvement > Components: Website >Reporter: Joern Kottmann >Priority: Minor > Labels: help-wanted > Attachments: kinow-opennlp-1.png, kinow-opennlp-2.png, > kinow-opennlp-3.png, kinow-opennlp-3-variations.png, OpenNLP.png > > > The current logo was not changed for a long time (if ever). > Lets create a new fresh looking logo. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OPENNLP-6) Create a new OpenNLP project logo for the site
[ https://issues.apache.org/jira/browse/OPENNLP-6?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-6: - Attachment: kinow-opennlp-3.png > Create a new OpenNLP project logo for the site > -- > > Key: OPENNLP-6 > URL: https://issues.apache.org/jira/browse/OPENNLP-6 > Project: OpenNLP > Issue Type: Improvement > Components: Website >Reporter: Joern Kottmann >Priority: Minor > Labels: help-wanted > Attachments: kinow-opennlp-1.png, kinow-opennlp-2.png, > kinow-opennlp-3.png, kinow-opennlp-3-variations.png, OpenNLP.png > > > The current logo was not changed for a long time (if ever). > Lets create a new fresh looking logo. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OPENNLP-6) Create a new OpenNLP project logo for the site
[ https://issues.apache.org/jira/browse/OPENNLP-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15823604#comment-15823604 ] Bruno P. Kinoshita commented on OPENNLP-6: -- My suggestion would be to leave it optional, but someone involved with the project might be a better person to answer this. Thanks for looking into this, Suneel, and thanks to Sally as well :-) > Create a new OpenNLP project logo for the site > -- > > Key: OPENNLP-6 > URL: https://issues.apache.org/jira/browse/OPENNLP-6 > Project: OpenNLP > Issue Type: Improvement > Components: Website >Reporter: Joern Kottmann >Priority: Minor > Labels: help-wanted > Attachments: kinow-opennlp-1.png, kinow-opennlp-2.png, OpenNLP.png > > > The current logo was not changed for a long time (if ever). > Lets create a new fresh looking logo. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OPENNLP-6) Create a new OpenNLP project logo for the site
[ https://issues.apache.org/jira/browse/OPENNLP-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15823328#comment-15823328 ] Bruno P. Kinoshita commented on OPENNLP-6: -- Sure thing, Joern: http://mail-archives.apache.org/mod_mbox/opennlp-users/201701.mbox/browser I wonder who maintains our blogs? Maybe we could get the word out through the official ASF blog, and perhaps the ASF Twitter accounts? Both TheASF and apacheopennlp. Thanks! > Create a new OpenNLP project logo for the site > -- > > Key: OPENNLP-6 > URL: https://issues.apache.org/jira/browse/OPENNLP-6 > Project: OpenNLP > Issue Type: Improvement > Components: Website >Reporter: Joern Kottmann >Priority: Minor > Labels: help-wanted > Attachments: kinow-opennlp-1.png, kinow-opennlp-2.png, OpenNLP.png > > > The current logo was not changed for a long time (if ever). > Lets create a new fresh looking logo. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OPENNLP-6) Create a new OpenNLP project logo for the site
[ https://issues.apache.org/jira/browse/OPENNLP-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15823070#comment-15823070 ] Bruno P. Kinoshita commented on OPENNLP-6: -- Attached two proposed logos. The first one is a simple combination of fonts, moving things around in Inkscape, then adjusting letter spacing manually. The second has a more formal letter style and a small image that was created based on dependency trees. It is a simple dependency tree as follows: {noformat} /\ /\ / / \ {noformat} But with two copies moved slightly to its right, with equal space in between them. The result looks like mountains, but if you pay attention you can see the dependency tree (I swear there is one :-) A single image with the fonts and colours used is available [here](http://kinow.deviantart.com/art/Proposed-logos-for-Apache-OpenNLP-657512914). SVG available as well. Made with Inkscape, fonts from Google Fonts. Cheers Bruno > Create a new OpenNLP project logo for the site > -- > > Key: OPENNLP-6 > URL: https://issues.apache.org/jira/browse/OPENNLP-6 > Project: OpenNLP > Issue Type: Improvement > Components: Website >Reporter: Joern Kottmann >Priority: Minor > Labels: help-wanted > Attachments: kinow-opennlp-1.png, kinow-opennlp-2.png, OpenNLP.png > > > The current logo was not changed for a long time (if ever). > Lets create a new fresh looking logo. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OPENNLP-6) Create a new OpenNLP project logo for the site
[ https://issues.apache.org/jira/browse/OPENNLP-6?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-6: - Attachment: kinow-opennlp-2.png > Create a new OpenNLP project logo for the site > -- > > Key: OPENNLP-6 > URL: https://issues.apache.org/jira/browse/OPENNLP-6 > Project: OpenNLP > Issue Type: Improvement > Components: Website >Reporter: Joern Kottmann >Priority: Minor > Labels: help-wanted > Attachments: kinow-opennlp-1.png, kinow-opennlp-2.png, OpenNLP.png > > > The current logo was not changed for a long time (if ever). > Lets create a new fresh looking logo. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OPENNLP-798) Division by Zero exception in the EvalParameters.
[ https://issues.apache.org/jira/browse/OPENNLP-798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14924188#comment-14924188 ] Bruno P. Kinoshita commented on OPENNLP-798: When I had a look at the _EvalParameters(Context[] params, int numOutcomes)_ constructor, and the subsequent code called in _EvalParameters(Context[] params, double correctionParam, double correctionConstant, int numOutcomes)_, where you have: {noformat} this.constantInverse = 1.0 / correctionConstant {noformat} I thought it could raise a division by zero exception. However, upon a closer look, I realised it is a double division, which results in Infinity rather than throwing an exception. Joern, I think this issue can be closed as cannot reproduce, unless Gustavo has an exception trace or some sample code to reproduce it. > Division by Zero exception in the EvalParameters. > - > > Key: OPENNLP-798 > URL: https://issues.apache.org/jira/browse/OPENNLP-798 > Project: OpenNLP > Issue Type: Bug > Components: Machine Learning >Reporter: Gustavo Knuppe > > The EvalParameters(Context[] params, int numOutcomes) constructor rises a > division by zero exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
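For anyone wanting to check the point in the OPENNLP-798 comment above, here is a minimal sketch of how Java treats the two kinds of division. It is not OpenNLP code: the class DoubleDivisionSketch and the helpers inverse and intInverse are hypothetical names made up for illustration, and only the expression 1.0 / correctionConstant comes from the comment.

```java
public class DoubleDivisionSketch {

    // Hypothetical helper mirroring the quoted expression; not part of OpenNLP.
    // Floating-point division by zero follows IEEE 754 and does not throw.
    public static double inverse(double correctionConstant) {
        return 1.0 / correctionConstant;
    }

    // Hypothetical integer counterpart, for contrast only.
    // Integer division by zero throws ArithmeticException.
    public static int intInverse(int correctionConstant) {
        return 1 / correctionConstant;
    }

    public static void main(String[] args) {
        // Double division: yields positive infinity, no exception.
        System.out.println(inverse(0.0));

        // Integer division: this is the case that actually throws.
        try {
            intInverse(0);
        } catch (ArithmeticException e) {
            System.out.println("int division threw ArithmeticException");
        }
    }
}
```

So a zero correctionConstant would leave an infinite value in the field silently rather than fail at construction time, which seems consistent with the issue not being reproducible as an exception.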
[jira] [Commented] (OPENNLP-613) I want help to OpenNLP support PT-BR Language
[ https://issues.apache.org/jira/browse/OPENNLP-613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14132577#comment-14132577 ] Bruno P. Kinoshita commented on OPENNLP-613: I believe Alexandre means Brazilian Portuguese. I want help to OpenNLP support PT-BR Language - Key: OPENNLP-613 URL: https://issues.apache.org/jira/browse/OPENNLP-613 Project: OpenNLP Issue Type: Question Reporter: Alexandre Oliveira Labels: features, models, pre-trained, pt-br Hi, How I can help to create pre-trained models to PT-BR? Regards, Alexandre -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OPENNLP-591) [PATCH] Typos in Name Finder docs
[ https://issues.apache.org/jira/browse/OPENNLP-591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno P. Kinoshita updated OPENNLP-591: --- Summary: [PATCH] Typos in Name Finder docs (was: Typos in Name Finder docs) [PATCH] Typos in Name Finder docs - Key: OPENNLP-591 URL: https://issues.apache.org/jira/browse/OPENNLP-591 Project: OpenNLP Issue Type: Bug Reporter: Bruno P. Kinoshita Priority: Trivial Labels: typo Attachments: OPENNLP-591.patch Hello, I think there are a few typos in the Name Finder documentation page. Since I'm not a native speaker, it would be good to have an extra pair of eyes review my patch. Thanks for the awesome tool. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira