My thanks, Peter, for the contribution. I am outside the technical work, but this seems like a move that can only increase the overall usefulness. I hope this will be considered seriously by the others here.
Gio

On Fri, May 11, 2012 at 11:21 AM, Lewis John Mcgibbney <[email protected]> wrote:
> Hi Peter,
>
> As I said on the issue at [1], this looks like exciting work. I'm
> hoping that it sparks some conversation amongst us.
>
> As we are pushing for our first incubating release I'm not entirely
> sure that the restructuring is a viable option just now, however we
> should certainly not rule it out unless there is a justified argument.
>
> Thank you for the heads up on this.
>
> Lewis
>
> On Fri, May 11, 2012 at 7:41 AM, Peter Ansell <[email protected]> wrote:
>> Hi all,
>>
>> Over the past two days I have split up Any23 into a variety of modules
>> to make it easier to use different parts of the Any23 API. You can see
>> the code at [1]. The current module list in the parent pom reactor
>> looks like:
>>
>> <modules>
>>   <module>api</module>
>>   <module>csvutils</module>
>>   <module>encoding</module>
>>   <module>mime</module>
>>   <module>core</module>
>>   <module>test-resources</module>
>>   <module>extractor</module>
>>   <module>cli</module>
>>   <module>test</module>
>>   <module>service</module>
>>   <module>plugins/basic-crawler</module>
>>   <module>plugins/html-scraper</module>
>>   <module>plugins/office-scraper</module>
>>   <module>plugins/integration-test</module>
>>   <module>sources-dist</module>
>> </modules>
>>
>> All of the modules above core do not have dependencies on core, and
>> the core module only has a dependency on the api module.
>>
>> The api module mostly contains interfaces but it also contains factory
>> registries where they are fully Service Provider Interface (SPI)
>> driven (Any23PluginManager and WriterFactoryRegistry which I created
>> to alleviate the WriterRegistry hardcoding dependencies and
>> reflection/annotation code that isn't easy to extend outside of the
>> core library).
>> The ExtractorRegistry was too difficult to convert to
>> SPI just yet so I split it up into an interface and an implementation
>> (ExtractorRegistryImpl) with the interface in the API module and used
>> in some APIs where the singleton was previously used. These
>> registries, together with Rio RDFFormat for referencing RDF format
>> information, seemed to be enough to remove the hardcoding that I have
>> been discussing at https://issues.apache.org/jira/browse/ANY23-83
>>
>> The changes fit my purposes as I can easily slot in the encoding and
>> mime detection code without pulling in the core or extractor modules,
>> and the supported types for the mime detection include any formats I
>> register with OpenRDF Rio so it is extensible and modular for my
>> purposes.
>>
>> However, most of the changes are too large for easy patching and I
>> didn't arrange the changes into nice patches throughout as I was not
>> sure what was going to happen in the end. I have submitted two very
>> small patches to that issue, but there could be many more eventually
>> if the redesigned code is acceptable.
>>
>> Note, I also removed the Any23 NQuads implementation as it was missing
>> Factory implementations for the writer and parser classes so it wasn't
>> being picked up by Rio.createParser or any of the other static Rio
>> methods. I replaced it with the NQuads implementation from Sesametools
>> which includes these factories and so is recognised. When
>> http://www.openrdf.org/issues/browse/SES-802 gets implemented both of
>> these implementations will likely be deprecated anyway so it wasn't a
>> major issue for me. I would suggest in either case splitting out the
>> NQuads classes into a separate module and implementing a Factory for
>> both the parser and writer so they are picked up by SPI.
>>
>> There were some existing broken tests when I started, and there were a
>> small number of tests that broke throughout, including one that broke
>> when I updated to Tika-1.1.
>> They are temporarily ignored, but can be
>> found easily by checking the ignored tests when running the test
>> suite.
>>
>> I hope the changes are useful to others.
>>
>> If you want to suggest changes to my version on GitHub feel free to
>> open an issue or fork the repository and send a pull request back.
>>
>> Cheers,
>>
>> Peter
>>
>> [1] https://github.com/ansell/any23
>
>
>
> --
> Lewis
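
For anyone on the list less familiar with the SPI approach Peter describes, here is a minimal sketch of a registry that discovers factories through java.util.ServiceLoader instead of hardcoding implementation classes. Every type name in it (FormatWriterFactory, FormatWriterRegistry, NQuadsWriterFactory) is hypothetical and invented for illustration; it is not the code in Peter's branch, nor the Any23 or Rio API. The MIME type string is likewise only an example value.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.ServiceLoader;

// Illustrative sketch only: all names here are hypothetical, not Any23 or Rio classes.
class SpiRegistrySketch {

    /** Service interface; in a modular layout this would sit in an "api"-style module. */
    interface FormatWriterFactory {
        String mimeType();
        String describe();
    }

    /** Registry that depends only on the interface, never on concrete implementations. */
    static final class FormatWriterRegistry {
        private final Map<String, FormatWriterFactory> byMimeType = new LinkedHashMap<>();

        FormatWriterRegistry(Iterable<? extends FormatWriterFactory> factories) {
            for (FormatWriterFactory factory : factories) {
                // The first factory registered for a MIME type wins.
                byMimeType.putIfAbsent(factory.mimeType(), factory);
            }
        }

        /**
         * Discovers implementations listed in provider-configuration files
         * (META-INF/services/<binary name of the interface>) on the classpath.
         */
        static FormatWriterRegistry fromClasspath() {
            return new FormatWriterRegistry(ServiceLoader.load(FormatWriterFactory.class));
        }

        Optional<FormatWriterFactory> lookup(String mimeType) {
            return Optional.ofNullable(byMimeType.get(mimeType));
        }
    }

    /** A provider that a separate module could contribute without touching the registry. */
    static final class NQuadsWriterFactory implements FormatWriterFactory {
        @Override
        public String mimeType() {
            return "application/n-quads";
        }

        @Override
        public String describe() {
            return "hypothetical N-Quads writer";
        }
    }

    public static void main(String[] args) {
        // Explicit registration keeps this demo self-contained; with a
        // provider-configuration file in place, fromClasspath() would be used instead.
        FormatWriterRegistry registry =
                new FormatWriterRegistry(List.of(new NQuadsWriterFactory()));
        System.out.println(registry.lookup("application/n-quads")
                .map(FormatWriterFactory::describe)
                .orElse("no writer registered"));
    }
}
```

The point of the design is that a new format only needs a jar containing an implementation plus its provider-configuration file; the registry and the module holding the interface stay unchanged, which is what makes it possible to drop the dependency on a core module.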
