Hi Hady,
On Sun, Nov 24, 2013 at 9:40 PM, Hady elsahar <[email protected]> wrote:
> Hello All ,
>
> Regarding issue
> #38<https://github.com/dbpedia/extraction-framework/issues/38> (refactoring
> the core to accept new formats): I think the new core
> functionality is working now. What's needed next is some modifications, your
> suggestions for updates, and of course merging into the main branch.
>
> What was done so far:
>
> 1- Changed the Extractor trait to accept a type argument [T] [see
> commit<https://github.com/hadyelsahar/extraction-framework/commit/e26ef813dad098d573be34191dfaef13c78b5986>
> ]
> 2- Changed the CompositeExtractor class to load extractors for any type, not only
> PageNode [see
> commit<https://github.com/hadyelsahar/extraction-framework/commit/17dcaa8b2988e7fc8676532fa849fff1eabec9d0>
> ]
>
> 3- Refactoring the core [see commit
> <https://github.com/hadyelsahar/extraction-framework/commit/9ad75cd864d12025d2872b4e3c6cbe4d4fae3681>
> ]
>
> - Added a loadToParsers method to CompositeExtractor. This method
> will:
>
> - take a list of Extractors and split them by the type they accept
> - create a JsonParseExtractor object and load it with the Extractor[Json
> format] instances
> - create a WikiParseExtractor object and load it with the
> Extractor[PageNode] instances
> - create a CompositeExtractor object and load it with the
> Extractor[WikiPage] instances
>
> - Created a ParseExtractor class which:
>
> - takes a WikiPageFormat as an argument and decides on a suitable parser for
> it
> - gets loaded with Extractors
> - at runtime checks whether the page has the proper WikiPageFormat; if so, parses
> it with that parser and passes the result to all inner Extractors
> - WikiParseExtractor , CompositeExtractor are instances of the same
> class ParseExtractor but with a different WikiPageFormat argument
>
good!
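
For anyone following the thread without the branch checked out, here is a rough sketch of how I read the ParseExtractor idea: check the page format, parse once, and hand the parsed result to every inner extractor. This is not the code from the commits. The field names, the toy parser and the two sample extractors are made up for illustration, and the parser is passed in explicitly instead of being chosen from the WikiPageFormat as described above.

```scala
// Hypothetical stand-ins: only the names WikiPage, PageNode, WikiPageFormat,
// Extractor and ParseExtractor come from the mail; fields and bodies are invented.
object WikiPageFormat extends Enumeration {
  val WikiText, Json = Value
}

final case class WikiPage(title: String, format: WikiPageFormat.Value, source: String)
final case class PageNode(title: String, words: Seq[String])

trait Extractor[T] {
  def extract(input: T): Seq[String]
}

// Accepts raw pages of one format, parses them once, and passes the parsed
// result to every inner extractor. Pages in another format yield nothing.
class ParseExtractor[N](
    format: WikiPageFormat.Value,
    parse: WikiPage => N,
    inner: Seq[Extractor[N]]
) extends Extractor[WikiPage] {
  def extract(page: WikiPage): Seq[String] =
    if (page.format == format) {
      val node = parse(page) // parse once, reuse for all inner extractors
      inner.flatMap(_.extract(node))
    } else Seq.empty
}

object ParseExtractorDemo {
  // Toy parser (assumption): splits the source into whitespace-separated words.
  def parseWikiText(page: WikiPage): PageNode =
    PageNode(page.title, page.source.split("\\s+").toList.filter(_.nonEmpty))

  val titleExtractor: Extractor[PageNode] = new Extractor[PageNode] {
    def extract(node: PageNode): Seq[String] = Seq("title=" + node.title)
  }
  val wordCountExtractor: Extractor[PageNode] = new Extractor[PageNode] {
    def extract(node: PageNode): Seq[String] = Seq("words=" + node.words.size)
  }

  // Plays the role of WikiParseExtractor: a ParseExtractor bound to one format.
  val wikiParseExtractor: Extractor[WikiPage] =
    new ParseExtractor[PageNode](
      WikiPageFormat.WikiText,
      parseWikiText,
      Seq(titleExtractor, wordCountExtractor)
    )
}
```

With these toy definitions, calling ParseExtractorDemo.wikiParseExtractor.extract on WikiPage("Berlin", WikiPageFormat.WikiText, "Berlin is a city") returns Seq("title=Berlin", "words=4"), and a page whose format is WikiPageFormat.Json returns an empty Seq.
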
> *Next Steps:*
>
> 1- Load the WikiData Extractors created in the GSoC project into this branch
>
go ahead
> 2- In CompositeExtractor, in order to check for Extractor[T]: T is
> erased at runtime, so the cleanest way is to use a Scala TypeTag, which needs
> Scala 2.10. So:
>
> - as a workaround I added a Type enumeration to the Extractor class
> - future work would be installing Scala 2.10, then replacing the enum
> with a check on TypeTags
>
We talked about this and neither of us likes it :)
Creating superclasses for WikiPageExtractor, PageNodeExtractor and
JsonExtractor would result in less code, but since we'll change it anyway with
2.10, leave it like this and we will fix it after the merge.
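
To make the trade-off concrete, here is a minimal sketch of both variants: the hand-written enum marker, and a compiler-supplied tag once we are on 2.10. Note that it uses ClassTag rather than TypeTag. ClassTag also arrived with 2.10 and is part of the standard library, and for non-generic input types like PageNode it should be enough to tell extractors apart; a TypeTag version would look much the same. All names here (InputType, EnumExtractor, TaggedExtractor, ErasureDemo, JsonNode) are invented for the example and are not what is in the branch.

```scala
import scala.reflect.{ClassTag, classTag}

// Hypothetical stand-ins for the framework's input types.
final case class WikiPage(title: String)
final case class PageNode(title: String)
final case class JsonNode(title: String)

// Current workaround: each extractor declares its input type by hand,
// because the T in Extractor[T] is erased at runtime.
object InputType extends Enumeration {
  val WikiPageType, PageNodeType, JsonNodeType = Value
}

trait EnumExtractor[T] {
  def inputType: InputType.Value
  def extract(input: T): Seq[String]
}

// With 2.10: the compiler fills in the tag, so there is no marker to keep in sync.
abstract class TaggedExtractor[T](implicit val tag: ClassTag[T]) {
  def extract(input: T): Seq[String]
}

object ErasureDemo {
  // Enum variant: select by the hand-written marker.
  def byEnum(all: Seq[EnumExtractor[_]], wanted: InputType.Value): Seq[EnumExtractor[_]] =
    all.filter(_.inputType == wanted)

  // Tag variant: select by the runtime class captured in the ClassTag.
  def extractorsFor[T: ClassTag](all: Seq[TaggedExtractor[_]]): Seq[TaggedExtractor[T]] = {
    val wanted = classTag[T].runtimeClass
    all.filter(_.tag.runtimeClass == wanted).map(_.asInstanceOf[TaggedExtractor[T]])
  }

  val enumJson: EnumExtractor[JsonNode] = new EnumExtractor[JsonNode] {
    val inputType = InputType.JsonNodeType
    def extract(input: JsonNode): Seq[String] = Seq("json:" + input.title)
  }

  val pageNodeExtractor: TaggedExtractor[PageNode] = new TaggedExtractor[PageNode] {
    def extract(input: PageNode): Seq[String] = Seq("node:" + input.title)
  }
  val jsonExtractor: TaggedExtractor[JsonNode] = new TaggedExtractor[JsonNode] {
    def extract(input: JsonNode): Seq[String] = Seq("json:" + input.title)
  }

  val all: Seq[TaggedExtractor[_]] = Seq(pageNodeExtractor, jsonExtractor)
}
```

Here ErasureDemo.extractorsFor[PageNode](ErasureDemo.all) picks out only pageNodeExtractor, without any extractor having to name its own type. The cast inside extractorsFor is safe only because the tag comparison has already established the input type.
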
> 3- Get rid of the RootExtractor
>
> *Questions:*
> 1- Any suggestions or modifications needed ?
>
I think there are some things that could be improved, but we need to see the
whole picture first. Let's not spend further time discussing design; go
ahead and create a working draft first, and we can always improve it later.
> 2- The only difference now from JC's
> Design<https://f.cloud.github.com/assets/607468/363286/1f8da62c-a1ff-11e2-99c3-bb5136accc07.png>
> is
> that ParseExtractor passes the WikiPage to all inner Extractors instead of
> collecting them in one CompositeExtractor.
> It doesn't really add any new functionality, it just follows the pattern,
> so do you think we should add it?
>
I think my comment above covers your question :)
Good work Hady!
Best,
Dimitris
>
>
> thanks
> Regards
>
> -------------------------------------------------
> Hady El-Sahar
> Research Assistant
> Center of Informatics Sciences | Nile
> University<http://nileuniversity.edu.eg/>
>
>
>
>
> ------------------------------------------------------------------------------
> Shape the Mobile Experience: Free Subscription
> Software experts and developers: Be at the forefront of tech innovation.
> Intel(R) Software Adrenaline delivers strategic insight and game-changing
> conversations that shape the rapidly evolving mobile landscape. Sign up
> now.
> http://pubads.g.doubleclick.net/gampad/clk?id=63431311&iu=/4140/ostg.clktrk
> _______________________________________________
> Dbpedia-developers mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dbpedia-developers
>
>
--
Dimitris Kontokostas
Department of Computer Science, University of Leipzig
Research Group: http://aksw.org
Homepage:http://aksw.org/DimitrisKontokostas