Thank you Lewis!

So should I assume that 2.3 is "broken"? I'll try the upcoming 2.4 as you
suggested.

Apart from that, I still have one more question: what is the best way to
debug Any23 issues like this?
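
One check that can be done independently of Any23 is to confirm that the raw
HTML (as fetched, before any JavaScript runs) really contains JSON-LD script
blocks. Below is a minimal sketch of that idea, assuming only the Python
standard library. JsonLdScriptCollector and find_jsonld are hypothetical names
chosen for illustration, and the sample HTML is made up, not the Monster page.
It says nothing about how Any23 itself extracts the data.

```python
from html.parser import HTMLParser
import json


class JsonLdScriptCollector(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            # Attribute values can be None, so fall back to an empty string.
            script_type = (dict(attrs).get("type") or "").strip().lower()
            if script_type == "application/ld+json":
                self._in_jsonld = True
                self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def find_jsonld(html):
    """Return the parsed JSON-LD documents embedded in an HTML string."""
    collector = JsonLdScriptCollector()
    collector.feed(html)
    collector.close()
    return [json.loads(block) for block in collector.blocks]


# Made-up sample page: one JSON-LD block and one ordinary script.
sample = """<html><head>
<script type="application/ld+json">{"@type": "Organization", "name": "Example"}</script>
<script type="text/javascript">var x = 1;</script>
</head><body></body></html>"""

docs = find_jsonld(sample)
print(len(docs), docs[0]["@type"])
```

If this finds blocks in the downloaded page but the extractor reports zero
triples, the problem is more likely in the extraction step than in the page.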

On Wed, Sep 30, 2020 at 8:26 PM Lewis John McGibbney <[email protected]>
wrote:

> Hi Mauro,
>
> On 2020/09/24 11:28:24, Mauro Asprea <[email protected]> wrote:
> > Hello, what am I doing wrong?
> >
> > I downloaded the CLI binary distribution and verified that Google does
> > find the embedded JSON-LD triples, as you can see here:
> >
> https://search.google.com/structured-data/testing-tool#url=https%3A%2F%2Fwww.monster.com%2Fjobs%2Fsearch%2F%3Fq%3DRuby%26where%3DAustin__2C-TX
>
> Just a quick note: it is impossible to know what kind of 'standard' Google
> uses for the Structured Data Testing Tool. As you can see, it is also being
> deprecated pretty quickly.
>
> >
> > Then I run any23, but I get no quads/triples...
> >
> > hamilcar:apache-any23-cli-2.3:> bin/any23 rover -p -s -t \
> >   "https://www.monster.com/jobs/search/?q=Ruby&where=Austin__2C-TX" \
> >   -f json -e html-embedded-jsonld -l monster-jsonld.log
>
> Using the 2.4 RC#1 (https://dist.apache.org/repos/dist/dev/any23/2.4/), I
> get the following results, which include one Organization, one ItemList
> and 29 itemListElements:
>
> ./bin/any23 rover -p -s -t \
>   "https://www.monster.com/jobs/search/?q=Ruby&where=Austin__2C-TX" \
>   -f json -e html-embedded-jsonld -l monster-jsonld.log -o monster.json
>
> ------------------------------------------------------------------------
> Apache Any23 :: rover
> ------------------------------------------------------------------------
>
> >Summary:
>    -total calls: 1
>    -total triples: 128
>    -total runtime: 30 ms!
>    -tripls/ms: 4
>    -ms/calls: 30
> >Extractor: html-embedded-jsonld
>    -total calls: 1
>    -total triples: 128
>    -total runtime: 30 ms!
>    -tripls/ms: 4
>    -ms/calls: 30
>
> ------------------------------------------------------------------------
> Apache Any23 SUCCESS
> Total time: 2s
> Finished at: Wed Sep 30 11:11:04 PDT 2020
> Final Memory: 107M/367M
> ------------------------------------------------------------------------
>
> I suggest that you upgrade to 2.4.
>
> >
> >
> > How can I increase the logging level to see any hidden debug messages?
>
> You would need to hack the source code itself to add more debug
> logging.
>
> I created a ticket to address the log4j appender issue -
> https://issues.apache.org/jira/browse/ANY23-454
>
> >
> > Also as you can see, this webpage has an embedded LD+JSON script that is
> > not being picked up by the extractor. Help?
> >
>
> If I remove the extractor flag, i.e. -e html-embedded-jsonld, then I get
> many more results. Some of these are, however, trivial in nature, so they
> would need to be filtered out.
>
> >Summary:
>    -total calls: 21
>    -total triples: 189
>    -total runtime: 688 ms!
>    -tripls/ms: 0
>    -ms/calls: 32
> >Extractor: html-head-icbm
>    -total calls: 1
>    -total triples: 0
>    -total runtime: 6 ms!
>    -tripls/ms: 0
>    -ms/calls: 6
> >Extractor: html-mf-geo
>    -total calls: 1
>    -total triples: 0
>    -total runtime: 1 ms!
>    -tripls/ms: 0
>    -ms/calls: 1
> >Extractor: html-head-meta
>    -total calls: 1
>    -total triples: 16
>    -total runtime: 5 ms!
>    -tripls/ms: 3
>    -ms/calls: 5
> >Extractor: html-mf-adr
>    -total calls: 1
>    -total triples: 0
>    -total runtime: 1 ms!
>    -tripls/ms: 0
>    -ms/calls: 1
> >Extractor: html-mf-hcalendar
>    -total calls: 1
>    -total triples: 0
>    -total runtime: 1 ms!
>    -tripls/ms: 0
>    -ms/calls: 1
> >Extractor: html-mf-hresume
>    -total calls: 1
>    -total triples: 0
>    -total runtime: 1 ms!
>    -tripls/ms: 0
>    -ms/calls: 1
> >Extractor: html-mf-hreview
>    -total calls: 1
>    -total triples: 0
>    -total runtime: 1 ms!
>    -tripls/ms: 0
>    -ms/calls: 1
> >Extractor: consolidation-extractor
>    -total calls: 1
>    -total triples: 0
>    -total runtime: 0 ms!
>    -ms/calls: 0
> >Extractor: html-xpath
>    -total calls: 1
>    -total triples: 0
>    -total runtime: 0 ms!
>    -ms/calls: 0
> >Extractor: html-head-title
>    -total calls: 1
>    -total triples: 1
>    -total runtime: 1 ms!
>    -tripls/ms: 1
>    -ms/calls: 1
> >Extractor: html-mf-hcard
>    -total calls: 1
>    -total triples: 0
>    -total runtime: 0 ms!
>    -ms/calls: 0
> >Extractor: html-rdfa11
>    -total calls: 1
>    -total triples: 44
>    -total runtime: 33 ms!
>    -tripls/ms: 1
>    -ms/calls: 33
> >Extractor: html-mf-hreview-aggregate
>    -total calls: 1
>    -total triples: 0
>    -total runtime: 1 ms!
>    -tripls/ms: 0
>    -ms/calls: 1
> >Extractor: html-mf-license
>    -total calls: 1
>    -total triples: 0
>    -total runtime: 3 ms!
>    -tripls/ms: 0
>    -ms/calls: 3
> >Extractor: html-mf-xfn
>    -total calls: 1
>    -total triples: 0
>    -total runtime: 2 ms!
>    -tripls/ms: 0
>    -ms/calls: 2
> >Extractor: html-mf-species
>    -total calls: 1
>    -total triples: 0
>    -total runtime: 1 ms!
>    -tripls/ms: 0
>    -ms/calls: 1
> >Extractor: html-mf-hlisting
>    -total calls: 1
>    -total triples: 0
>    -total runtime: 0 ms!
>    -ms/calls: 0
> >Extractor: html-microdata
>    -total calls: 1
>    -total triples: 0
>    -total runtime: 2 ms!
>    -tripls/ms: 0
>    -ms/calls: 2
> >Extractor: html-mf-hrecipe
>    -total calls: 1
>    -total triples: 0
>    -total runtime: 0 ms!
>    -ms/calls: 0
> >Extractor: html-embedded-jsonld
>    -total calls: 1
>    -total triples: 128
>    -total runtime: 627 ms!
>    -tripls/ms: 0
>    -ms/calls: 627
> >Extractor: html-head-links
>    -total calls: 1
>    -total triples: 0
>    -total runtime: 2 ms!
>    -tripls/ms: 0
>    -ms/calls: 2
>


-- 
Mauro Asprea

E-Mail: [email protected]
Mobile: +34 654297582
Keybase: https://keybase.io/brutuscat
