[GitHub] any23 issue #34: ANY23-304 Add extractor for OpenIE

2017-08-23 Thread lewismc
Github user lewismc commented on the issue:

https://github.com/apache/any23/pull/34
  
Hi @ansell, in my last commit I've pushed a couple of (hopefully) 
satisfying additions, namely
 * removal of the open module from the CLI (meaning that the open 
extractor is not executed during normal unit test execution)
 * addition of some class loading logic which improves the flexibility of 
extractor detection based upon the presence of the open extractor.
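
For readers following the thread, here is a minimal sketch of what presence-based detection of an optional extractor could look like. It is not the code from the commit: the class name `OptionalExtractorDetector` and the candidate class `org.example.optional.HeavyExtractor` are hypothetical, and the only assumption is that the optional module is detected by whether one of its classes can be loaded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of Any23: detects optional classes on the classpath. */
final class OptionalExtractorDetector {

    /** Returns true if the named class can be located by the given class loader. */
    static boolean isPresent(String className, ClassLoader loader) {
        try {
            // initialize=false: only check that the class exists,
            // without running its static initializers.
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Keeps only the candidate class names that are actually loadable. */
    static List<String> available(List<String> candidates) {
        ClassLoader loader = OptionalExtractorDetector.class.getClassLoader();
        List<String> found = new ArrayList<>();
        for (String name : candidates) {
            if (isPresent(name, loader)) {
                found.add(name);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<String> candidates = Arrays.asList(
                "java.lang.String",                      // always present
                "org.example.optional.HeavyExtractor");  // hypothetical, normally absent
        System.out.println("available: " + available(candidates));
    }
}
```

With a check of this shape, code that depends on the optional module can be skipped cleanly when the module is not on the classpath.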

Open tests are no longer executed by default... this will 
dramatically reduce 1) the time the tests take, and 2) the memory required to 
execute the tests.

Thanks for any final review.
Lewis


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] any23 issue #34: ANY23-304 Add extractor for OpenIE

2017-08-05 Thread lewismc
Github user lewismc commented on the issue:

https://github.com/apache/any23/pull/34
  
Hi @ansell, yes this is a separate module; however, it currently always builds 
with the CLI module. I'm going to push an update which disables the module tests by 
default.




[GitHub] any23 issue #34: ANY23-304 Add extractor for OpenIE

2017-08-01 Thread ansell
Github user ansell commented on the issue:

https://github.com/apache/any23/pull/34
  
Is it an optional plugin in the current setup, so that users with minimal 
memory available do not need to load it? I haven't had time to look 
through it, but I see there is a new openie module.




[GitHub] any23 issue #34: ANY23-304 Add extractor for OpenIE

2017-08-01 Thread ansell
Github user ansell commented on the issue:

https://github.com/apache/any23/pull/34
  
My main objections before were about the larger memory requirements for 
default use and not being able to run the tests without OOM on my mid-range 
development machine.




[GitHub] any23 issue #34: ANY23-304 Add extractor for OpenIE

2017-08-01 Thread lewismc
Github user lewismc commented on the issue:

https://github.com/apache/any23/pull/34
  
Will commit within next day or so if there are no objections.




[GitHub] any23 issue #34: ANY23-304 Add extractor for OpenIE

2017-07-27 Thread lewismc
Github user lewismc commented on the issue:

https://github.com/apache/any23/pull/34
  
Hi @ansell, I finally got around to addressing your comments. Just to 
refresh your memory, use of FileOutputStream (as opposed to 
ByteArrayOutputStream) within the OpenExtractorTest.java logic is more 
performant, by around 1/4 second or so.
Do you have any further comments on this patch?
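
To make the two options concrete, here is a small sketch of writing the same output either to an in-memory buffer or to a temporary file. It is only an illustration: `OutputSinkSketch` and `writeResult` are hypothetical stand-ins for the test logic, and the sketch does not attempt to reproduce the timing difference mentioned above.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical illustration: the same writer feeding two different sinks. */
final class OutputSinkSketch {

    /** Stand-in for whatever produces the extraction output. */
    static void writeResult(OutputStream out, String payload) throws IOException {
        out.write(payload.getBytes(StandardCharsets.UTF_8));
        out.flush();
    }

    /** Variant 1: the whole output is buffered on the heap. */
    static byte[] viaMemory(String payload) {
        try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            writeResult(out, payload);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Variant 2: the output goes to a temporary file and is read back afterwards. */
    static byte[] viaTempFile(String payload) {
        try {
            Path file = Files.createTempFile("extract-", ".out");
            try (OutputStream out = new FileOutputStream(file.toFile())) {
                writeResult(out, payload);
            }
            byte[] bytes = Files.readAllBytes(file);
            Files.delete(file);
            return bytes;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String payload = "example triples";
        System.out.println("memory bytes: " + viaMemory(payload).length);
        System.out.println("file bytes:   " + viaTempFile(payload).length);
    }
}
```

Because both variants go through the same `OutputStream` interface, swapping one for the other is a one-line change at the call site.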




[GitHub] any23 issue #34: ANY23-304 Add extractor for OpenIE

2017-03-01 Thread ansell
Github user ansell commented on the issue:

https://github.com/apache/any23/pull/34
  
Tests failed for me with OOM:

```
[INFO] Compiling 1 source file to /home/mint/gitrepos/any23/openie/target/test-classes
[INFO] 
[INFO] --- maven-surefire-plugin:2.19.1:test (default-test) @ apache-any23-openie ---

---
 T E S T S
---
Running org.apache.any23.openie.OpenIEExtractorTest
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/mint/.m2/repository/ch/qos/logback/logback-classic/1.1.2/logback-classic-1.1.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/mint/.m2/repository/org/slf4j/slf4j-log4j12/1.7.21/slf4j-log4j12-1.7.21.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [ch.qos.logback.classic.util.ContextSelectorStaticBinder]
Loading feature templates.
Loading models.
Loading lexica.
Loading configuration.
Loading feature templates.
Loading models.
Loading feature templates.
Loading models.
Loading lexica.
Loading feature templates.
Loading models.
Loading feature templates.
Loading models.
Loading lexica.
Loading feature templates.
Loading models.
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 20.977 sec <<< FAILURE! - in org.apache.any23.openie.OpenIEExtractorTest
testExtractFromHTMLDocument(org.apache.any23.openie.OpenIEExtractorTest)  Time elapsed: 20.282 sec  <<< ERROR!
java.lang.OutOfMemoryError: Java heap space
at org.apache.any23.openie.OpenIEExtractorTest.extract(OpenIEExtractorTest.java:75)
at org.apache.any23.openie.OpenIEExtractorTest.testExtractFromHTMLDocument(OpenIEExtractorTest.java:65)

```




[GitHub] any23 issue #34: ANY23-304 Add extractor for OpenIE

2017-03-01 Thread lewismc
Github user lewismc commented on the issue:

https://github.com/apache/any23/pull/34
  
PING... anyone that is able to provide a review? Would be very much 
appreciated.




[GitHub] any23 issue #34: ANY23-304 Add extractor for OpenIE

2017-02-27 Thread lewismc
Github user lewismc commented on the issue:

https://github.com/apache/any23/pull/34
  
Unfortunately... because of the bugs that caused the ```META-INF/services``` 
directories to be filtered out, the plugins for Any23 2.0 are not 
as useful as they should be, as they cannot be dynamically discovered if present 
on the classpath. We should potentially push Any23 2.1 once this patch is 
merged into master.
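
As background on why the missing directories matter, here is a minimal sketch of classpath discovery with `java.util.ServiceLoader`. The interface `DemoExtractorFactory` and the class `ServiceDiscoverySketch` are hypothetical and are not Any23 types; the point is only that a provider is found when a provider-configuration file named after the service interface exists under `META-INF/services/`, and is silently missed when that file is absent from the jar.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

/** Hypothetical illustration of ServiceLoader-based discovery; not Any23 code. */
final class ServiceDiscoverySketch {

    /** Hypothetical service interface standing in for an extractor factory. */
    public interface DemoExtractorFactory {
        String name();
    }

    /**
     * Collects the names of all providers registered for DemoExtractorFactory.
     * Providers are looked up through a file on the classpath called
     * META-INF/services/ServiceDiscoverySketch$DemoExtractorFactory
     * (the binary name of the interface), listing one implementation per line.
     */
    static List<String> discover() {
        List<String> names = new ArrayList<>();
        for (DemoExtractorFactory factory : ServiceLoader.load(DemoExtractorFactory.class)) {
            names.add(factory.name());
        }
        return names;
    }

    public static void main(String[] args) {
        // No provider-configuration file ships with this sketch, so nothing is
        // discovered and no error is raised: the same silent symptom as a jar
        // whose META-INF/services directory was filtered out at build time.
        System.out.println("providers found: " + discover().size());
    }
}
```

Note that the lookup fails quietly: there is no exception when the configuration file is absent, which is why this kind of packaging problem can go unnoticed.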




[GitHub] any23 issue #34: ANY23-304 Add extractor for OpenIE

2017-02-27 Thread lewismc
Github user lewismc commented on the issue:

https://github.com/apache/any23/pull/34
  
Hi @ansell this is now fixed... if you could pull the code and let me know 
how you get on it would be appreciated.
After a good bit of debugging I discovered that some erroneous 
`` descriptions in plugin pom.xml files meant that the 
```META-INF/services``` directories were being filtered out from the generated 
.jar artifacts... meaning that the ServiceLoader did not discover them.
Anyway... if you could pull the code and let me know how you get on it 
would be appreciated. This is working well for me.
One final thing to note, you will see that for the appassembler plugin 
definition in ```cli/pom.xml``` we increase the JVM arguments to 6000m... this 
is because OpenIE is pretty memory intensive.
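
For anyone wanting to check whether their JVM is actually running with a heap of that size, here is a small hedged sketch. `HeapCheckSketch` and the `REQUIRED_MB` threshold are hypothetical; the 6000 figure is simply taken from the setting mentioned above, not from a measured requirement of OpenIE.

```java
/** Hypothetical helper for inspecting the configured maximum heap; not Any23 code. */
final class HeapCheckSketch {

    /** Assumed threshold in MB, mirroring the 6000m figure from the thread. */
    static final long REQUIRED_MB = 6000;

    /** Maximum heap the JVM will attempt to use, in MB. */
    static long maxHeapMb() {
        // maxMemory() returns Long.MAX_VALUE when there is no inherent limit.
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /** Pure comparison, kept separate so it can be tested without a large heap. */
    static boolean hasEnoughHeap(long availableMb, long requiredMb) {
        return availableMb >= requiredMb;
    }

    public static void main(String[] args) {
        long available = maxHeapMb();
        // Caveat: the value reported for a given -Xmx setting can be somewhat
        // lower than the requested size, depending on the garbage collector,
        // so a strict comparison against the exact -Xmx value may be too tight.
        System.out.println("max heap (MB): " + available);
        System.out.println("meets assumed threshold: " + hasEnoughHeap(available, REQUIRED_MB));
    }
}
```

A check like this could be used to skip or warn about memory-intensive work up front, rather than failing later with an `OutOfMemoryError`.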




[GitHub] any23 issue #34: ANY23-304 Add extractor for OpenIE

2017-02-23 Thread lewismc
Github user lewismc commented on the issue:

https://github.com/apache/any23/pull/34
  
Yep you're right. Bang on the money. I'll update the PR.




[GitHub] any23 issue #34: ANY23-304 Add extractor for OpenIE

2017-02-23 Thread ansell
Github user ansell commented on the issue:

https://github.com/apache/any23/pull/34
  
The cli module may need the new module added as a dependency to pull it 
onto the classpath. Strangely enough, it appears as though none of the other 
plugins are cli dependencies.




[GitHub] any23 issue #34: ANY23-304 Add extractor for OpenIE

2017-02-23 Thread lewismc
Github user lewismc commented on the issue:

https://github.com/apache/any23/pull/34
  
OK, so implementing ExtractorPlugin is not necessary... none of the other 
plugins use this logic.
I'm trying to get it working via the cli appassembler script; however, no joy yet.




[GitHub] any23 issue #34: ANY23-304 Add extractor for OpenIE

2017-02-23 Thread ansell
Github user ansell commented on the issue:

https://github.com/apache/any23/pull/34
  
I haven't looked at it recently. The META-INF/services should be enough on 
their own without the explicit plugin support but I can't recall whether there 
are any other differences that could affect usage.




[GitHub] any23 issue #34: ANY23-304 Add extractor for OpenIE

2017-02-23 Thread lewismc
Github user lewismc commented on the issue:

https://github.com/apache/any23/pull/34
  
@ansell is it necessary to put this new module into ```plugins``` and have 
the new extractor implement 
[ExtractorPlugin](http://any23.apache.org/apidocs/index.html?org/apache/any23/plugin/ExtractorPlugin.html)?

