Hi Kingsley,

Thanks for your advice. I think there's a problem with my copy of Virtuoso:
previously there were no cartridges to select on the crawler's setup page.
Only after I went to RDF > Sponger > Cartridges/Meta cartridges, unchecked
everything, and rechecked it all did they appear on the crawler's setup
page.

Now, crawling with only Freebase as the cartridge and Freebase NYTC and
NYTCF as meta cartridges, I get only one page crawled, though it is
correctly sponged and inserted into the quad store via rdf_sink. I haven't
activated 'Single page download', and I've tried various forms of
http://rdf.freebase.com/ as 'Follow links matching', but I still get just
one page.

Even after re-enabling all the cartridges, I still don't get more than one
page. That seems odd to me: before the list of cartridges appeared on the
crawler's setup page, crawling itself worked perfectly and the only problems
were with sponging and noise. Now I can't get it to crawl more than one page
with any setup.

There's one more problem: every time I set up a new crawl job, the entry for
the cartridge is duplicated lower down, in the RDF Cartridge section. Please
refer to this image: http://i.imgur.com/3KLUx.png

Please note that:
1- I had previously removed this instance's virtuoso.db file from disk, but
it was regenerated, seemingly without problems.
2- By "all the cartridges", I mean all except the first one (HTTP in RDF).

So, any ideas on where to start debugging the problem? I have already tried
enabling the Sponger's logging feature, but I get nothing in the log. Maybe I
have to reset the crawler's settings or something like that.


Thanks for your time and sorry for my wall of text.

Best,
Parsa

On Thu, Sep 23, 2010 at 7:18 PM, Parsa Ghaffari <[email protected]> wrote:

> Dear all,
>
> I'm trying to make a mashup of DBpedia and Freebase. At the query level, I
> know I can use SPARQL pragmas for owl:sameAs, but I think that limits me to
> 2.4 million interlinked concepts. On the side of Virtuoso's built-in
> crawler, I think I'll get some noise (i.e. data from outside
> rdf.freebase.com), and I still don't understand the crawler well enough to
> know whether it crawls the whole of Freebase. Can someone advise me on
> these two issues, please?
> Thanks a lot.
>
> Best,
> Parsa
>



-- 
Parsa B. Ghaffari
