Re: about the supported input format of any23

armon Thu, 21 Jun 2012 15:11:52 -0700

 Yes, so how can I solve it? By the way, it still doesn't work when I save the
XML part of the data returned by
http://en.wikipedia.org/w/api.php?action=query&list=search&srwhat=text&srsearch=meaning
 to a file; the XML file is attached.




armon


On Friday, June 22, 2012 at 5:59 AM, Lewis John Mcgibbney wrote:

> No, you're doing nothing incorrectly. I get pretty dismal results too
> with basic-crawler within Any23; please see below.
> 
> lewismc@lewismc-HP-Mini-110-3100:~/ASF/trunk/runtime/local$ any23 rover http://en.wikipedia.org/w/api.php?action=query&list=search&srwhat=text&srsearch=meaning
> [1] 2956
> [2] 2957
> [3] 2958
> lewismc@lewismc-HP-Mini-110-3100:~/ASF/trunk/runtime/local$
> ------------------------------------------------------------------------
> Apache Any23 :: rover
> ------------------------------------------------------------------------
> 
> @prefix dcterms: <http://purl.org/dc/terms/> .
> 
> <http://en.wikipedia.org/w/api.php?action=query> dcterms:title
> "MediaWiki API Result" .
> 
> ------------------------------------------------------------------------
> Apache Any23 SUCCESS
> Total time: 2s
> Finished at: Thu Jun 21 22:53:27 BST 2012
> Final Memory: 24M/483M
> ------------------------------------------------------------
> [1] Done any23 rover
> http://en.wikipedia.org/w/api.php?action=query
> [2]- Done list=search
> [3]+ Done srwhat=text
> 
> The problem is that I don't know how crawler4j deals with
> characters such as '?' within URL strings, or whether it treats them
> as query delimiters. By the looks of the log output above, the URL
> string is being treated incorrectly.
> 
> Sitting above all of this is the fact that I don't think the wiki
> markup syntax is supported within Any23 parser implementations.
> 
> Lewis
> 
> 
> On Thu, Jun 21, 2012 at 10:29 PM, armon <[email protected] 
> (mailto:[email protected])> wrote:
> > Even when I copy the XML part of the data from the URL and use it as the
> > input content, it still doesn't work, but when I try an RDF file it works
> > well. Am I doing anything incorrectly?
> > 
> > 
> > 2012/6/22 armon <[email protected] (mailto:[email protected])>
> > 
> > > Hi Lewis, thanks very much for your reply. I am sorry to bother you so
> > > late.
> > > 
> > > the url I used was:
> > > 
> > > 
> > > http://en.wikipedia.org/w/api.php?action=query&list=search&srwhat=text&srsearch=meaning
> > > 
> > > 
> > > and then I used the command: ./any23 rover url (the URL shown above) to
> > > get the result.
> > > 
> > > thanks.
> > > 
> > > armon
> > > 
> > > 
> > > 
> > > 
> > > 
> > > 
> > > 2012/6/22 Lewis John Mcgibbney <[email protected] 
> > > (mailto:[email protected])>
> > > 
> > > > Hi Armon,
> > > > 
> > > > On Thu, Jun 21, 2012 at 4:15 PM, armon <[email protected] 
> > > > (mailto:[email protected])> wrote:
> > > > > Hi,
> > > > >  I do some data transform currently from xml-format wiki data
> > > > 
> > > > Can you give a small example of this xml?
> > > > 
> > > > > (retrieved by wikipedia API) to turtle,
> > > > 
> > > > Also a small example of your turtle
> > > > 
> > > > > > but it seems that any23 doesn't
> > > > > > work correctly. (I used the command: ./any23 rover url)
> > > > 
> > > > What do you get on stdout? I can easily use the any23 parsers to
> > > > fetch structure from Wikipedia pages... but this is not what you
> > > > are referring to... I need some more information from you, please.
> > > > 
> > > > > 
> > > > >  Does any23 actually support the xml data retrieved by wikipedia
> > > > API
> > > > > as the input format ?
> > > > 
> > > > Please see above
> > > > 
> > > > 
> > > > 
> > > > 
> > > > 
> > > > --
> > > > Lewis
> 
> 
> 
> -- 
> Lewis
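
One likely reading of the quoted log, not confirmed anywhere in this thread: the job-control lines ("[1] 2956", "[2]- Done list=search", "[3]+ Done srwhat=text") and the truncated subject <http://en.wikipedia.org/w/api.php?action=query> look like the shell cutting the unquoted URL at each '&' before any23 ever sees it. The Python sketch below only models that behaviour and shows a quoted form of the command. The helper names are made up for illustration, and the splitter is a deliberately naive stand-in for real shell parsing.

```python
import shlex
from urllib.parse import urlsplit, parse_qs

# The URL used throughout the thread.
URL = ("http://en.wikipedia.org/w/api.php"
       "?action=query&list=search&srwhat=text&srsearch=meaning")


def split_like_unquoted_shell(command_line):
    """Naive model of a POSIX shell cutting an unquoted line at each '&'.

    Simplification (assumption): the line contains no quotes or escapes,
    so a plain string split is enough for illustration.
    """
    return [part.strip() for part in command_line.split("&")]


def quoted_command(url):
    """Build the any23 command line with the URL quoted as one shell word."""
    return "any23 rover " + shlex.quote(url)


if __name__ == "__main__":
    # What the shell appears to have run: one command per piece.
    for piece in split_like_unquoted_shell("any23 rover " + URL):
        print("shell command:", piece)

    # Quoted form: the whole URL stays a single argument.
    safe = quoted_command(URL)
    print(safe)
    print(shlex.split(safe))

    # The query parameters any23 should receive as part of the URL.
    print(parse_qs(urlsplit(URL).query))
```

If this reading is right, wrapping the URL in single quotes on the command line should at least get the full query string to rover. Whether Any23 can then extract anything useful from the MediaWiki API response is a separate question.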
