On 12 May 2012 18:34, Mattmann, Chris A (388J) <
[email protected]> wrote:

> Hi Peter,
>
> Thanks for your help and for a detailed explanation of what you did!
>
> I, for one, would be super supportive if you had time to figure out a way
> to get it into Apache Any23. I'm sure the rest of the PPMC would be happy
> and willing to work with you to develop JIRA issues/patches, etc., to
> facilitate this.
>

I would be happy!
+1

Mic

>
> Thank you again for your work!
>

Thanks.
Mic


>
> Cheers,
> Chris
>
> On May 10, 2012, at 8:41 PM, Peter Ansell wrote:
>
> > Hi all,
> >
> > Over the past two days I have split up Any23 into a variety of modules
> > to make it easier to use different parts of the Any23 API. You can see
> > the code at [1]. The current module list in the parent pom reactor
> > looks like:
> >
> >  <modules>
> >    <module>api</module>
> >    <module>csvutils</module>
> >    <module>encoding</module>
> >    <module>mime</module>
> >    <module>core</module>
> >    <module>test-resources</module>
> >    <module>extractor</module>
> >    <module>cli</module>
> >    <module>test</module>
> >    <module>service</module>
> >    <module>plugins/basic-crawler</module>
> >    <module>plugins/html-scraper</module>
> >    <module>plugins/office-scraper</module>
> >    <module>plugins/integration-test</module>
> >    <module>sources-dist</module>
> >  </modules>
> >
> > None of the modules listed before core depends on core, and the
> > core module depends only on the api module.
> >
> > The api module mostly contains interfaces, but it also contains the
> > factory registries that are fully Service Provider Interface (SPI)
> > driven (Any23PluginManager and WriterFactoryRegistry, which I created
> > to avoid the WriterRegistry's hardcoded dependencies and
> > reflection/annotation code that isn't easy to extend outside of the
> > core library). The ExtractorRegistry was too difficult to convert to
> > SPI just yet, so I split it into an interface and an implementation
> > (ExtractorRegistryImpl), with the interface in the api module and used
> > in some APIs where the singleton was previously used. These
> > registries, together with Rio RDFFormat for referencing RDF format
> > information, seemed to be enough to remove the hardcoding that I have
> > been discussing at https://issues.apache.org/jira/browse/ANY23-83
> >
> > The changes fit my purposes as I can easily slot in the encoding and
> > mime detection code without pulling in the core or extractor modules,
> > and the supported types for the mime detection include any formats I
> > register with OpenRDF Rio so it is extensible and modular for my
> > purposes.
> >
> > However, most of the changes are too large for easy patching and I
> > didn't arrange the changes into nice patches throughout as I was not
> > sure what was going to happen in the end. I have submitted two very
> > small patches to that issue, but there could be many more eventually
> > if the redesigned code is acceptable.
> >
> > Note, I also removed the Any23 NQuads implementation as it was missing
> > Factory implementations for the writer and parser classes so it wasn't
> > being picked up by Rio.createParser or any of the other static Rio
> > methods. I replaced it with the NQuads implementation from Sesametools
> > which includes these factories and so is recognised. When
> > http://www.openrdf.org/issues/browse/SES-802 gets implemented both of
> > these implementations will likely be deprecated anyway so it wasn't a
> > major issue for me. I would suggest in either case splitting out the
> > NQuads classes into a separate module and implementing a Factory for
> > both the parser and writer so they are picked up by SPI.
> >
> > There were some existing broken tests when I started, and there were a
> > small number of tests that broke throughout, including one that broke
> > when I updated to Tika-1.1. They are temporarily ignored, but can be
> > found easily by checking the ignored tests when running the test
> > suite.
> >
> > I hope the changes are useful to others.
> >
> > If you want to suggest changes to my version on GitHub feel free to
> > open an issue or fork the repository and send a pull request back.
> >
> > Cheers,
> >
> > Peter
> >
> > [1] https://github.com/ansell/any23
>
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: [email protected]
> WWW:   http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
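
A minimal sketch of the SPI-driven registry idea Peter describes above, for
readers who have not used java.util.ServiceLoader. This is not the Any23
code: SketchWriterFactory, NQuadsSketchFactory, SketchWriterRegistry and the
MIME type string are hypothetical names chosen for illustration. The only
real API assumed is java.util.ServiceLoader, which discovers providers listed
in META-INF/services files on the classpath.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.ServiceLoader;

/** Hypothetical provider interface; in a modular build it would sit in the api module. */
interface SketchWriterFactory {
    String identifier(); // short name, e.g. "nquads"
    String mimeType();   // MIME type the factory's writer produces (illustrative)
}

/**
 * Hypothetical provider. To be discovered by ServiceLoader, a real provider
 * would need to be a public class with a public no-argument constructor and
 * be listed in META-INF/services/<fully qualified interface name>.
 */
final class NQuadsSketchFactory implements SketchWriterFactory {
    @Override public String identifier() { return "nquads"; }
    @Override public String mimeType() { return "text/x-nquads"; }
}

/** Registry that indexes whatever providers it is given; no class names are hardcoded. */
final class SketchWriterRegistry {
    private final Map<String, SketchWriterFactory> byMimeType = new LinkedHashMap<>();

    SketchWriterRegistry(Iterable<? extends SketchWriterFactory> providers) {
        for (SketchWriterFactory factory : providers) {
            byMimeType.put(factory.mimeType(), factory);
        }
    }

    /** Builds the registry from providers found on the classpath via ServiceLoader. */
    static SketchWriterRegistry fromClasspath() {
        return new SketchWriterRegistry(ServiceLoader.load(SketchWriterFactory.class));
    }

    /** Looks up a factory by MIME type; empty if no provider registered that type. */
    Optional<SketchWriterFactory> forMimeType(String mimeType) {
        return Optional.ofNullable(byMimeType.get(mimeType));
    }
}

final class SpiSketchDemo {
    public static void main(String[] args) {
        // Without a META-INF/services entry nothing is discovered, so the
        // demo hands the provider to the registry directly.
        SketchWriterRegistry registry =
                new SketchWriterRegistry(Collections.singletonList(new NQuadsSketchFactory()));
        registry.forMimeType("text/x-nquads")
                .ifPresent(factory -> System.out.println(factory.identifier()));
    }
}
```

Because the registry only sees the interface, a new format could be added by
putting another jar with its own META-INF/services entry on the classpath,
without changing the module that holds the registry.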


-- 
Michele Mostarda
Senior Software Engineer
skype: michele.mostarda
twitter: micmos
mail: [email protected]
site : http://www.michelemostarda.com
