nutch-site.xml, that's what I meant.
So the syntax in index.parse.md would be:
metatag.og:image,metatag.og:image:alt?
Since we are at it:
Rel-tag, I believe we have a plugin for this, but from what I gather it
only extracts "rel=tag" and no other "rel" tags, or am I mistaken?
and
I
Yes, of course, defining properties in nutch-site.xml (but not "site.xml")
also works. It's the usual hierarchy:
bin/nutch command -Dkey=value ...
overrides the property in nutch-site.xml
(must be on the classpath: runtime/local/conf, or inside the nutch.job)
overrides the definition in n
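The lookup order described in this message (command-line -D first, then nutch-site.xml, then a lower-priority layer) can be sketched in a few lines of Python. This is only an illustration of the precedence, not Nutch code: the function name `resolve`, the layer name `base_defaults`, and all property values are made up for the example.

```python
from collections import ChainMap

def resolve(command_line, site_config, base_defaults):
    """Merge three property layers; earlier arguments win over later ones.

    command_line  -> stands in for -Dkey=value options (highest priority)
    site_config   -> stands in for properties from nutch-site.xml
    base_defaults -> hypothetical stand-in for the lowest-priority layer
    """
    # ChainMap searches its maps left to right and returns the first hit,
    # which is exactly the "X overrides Y overrides Z" rule.
    return dict(ChainMap(command_line, site_config, base_defaults))

# Hypothetical values, for illustration only.
effective = resolve(
    {"plugin.includes": "protocol-http|parse-tika"},
    {"plugin.includes": "site-value", "index.parse.md": "site-value"},
    {"plugin.includes": "default-value", "index.parse.md": "default-value",
     "some.other.key": "default-value"},
)
print(effective["plugin.includes"])  # value from the command-line layer
print(effective["index.parse.md"])   # value from the site layer
print(effective["some.other.key"])   # value from the lowest layer
```

A property set on the command line therefore never needs to be removed from nutch-site.xml first; the higher layer simply shadows it for that one run.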
PS: Does this work when configured in site.xml like regular metadata?
On Tue, Jun 12, 2018 at 1:31 PM BlackIce wrote:
> sweet thnx!
>
> On Tue, Jun 12, 2018 at 1:29 PM Sebastian Nagel <
> wastl.na...@googlemail.com> wrote:
>
>> > stoopid question, but I can't find any info on it... can we now p
sweet thnx!
On Tue, Jun 12, 2018 at 1:29 PM Sebastian Nagel
wrote:
> > stoopid question, but I can't find any info on it... can we now parse Open
> > Graph metatags?
>
> parse-tika extracts og:* metatags
>
> % bin/nutch parsechecker -Dplugin.includes='protocol-http|parse-tika' http://ogp.me/
> stoopid question, but I can't find any info on it... can we now parse Open
> Graph metatags?
parse-tika extracts og:* metatags
% bin/nutch parsechecker -Dplugin.includes='protocol-http|parse-tika' http://ogp.me/
...
Parse Metadata: og:image=http://ogp.me/logo.png og:type=website
og:image:widt
+1 Nice work all!
On 11-06-18 23:44, BlackIce wrote:
+1
stoopid question, but I can't find any info on it... can we now parse Open
Graph metatags?
Greetz
On Mon, Jun 11, 2018 at 9:11 PM Roannel Fernández Hernández
wrote:
+1
Regards
- Chris Mattmann escribió:
++1!
Sounds great.
+1
stoopid question, but I can't find any info on it... can we now parse Open
Graph metatags?
Greetz
On Mon, Jun 11, 2018 at 9:11 PM Roannel Fernández Hernández
wrote:
> +1
>
> Regards
>
> - Chris Mattmann escribió:
> > ++1!
> >
> > Sounds great.
> >
> > Cheers,
> >
> > Ch
+1
Regards
- Chris Mattmann escribió:
> ++1!
>
> Sounds great.
>
> Cheers,
>
> Chris
>
> From: Sebastian Nagel
> Reply-To: "d...@nutch.apache.org"
> Date: Monday, June 11, 2018 at 7:35 AM
> To: "user@nutch.apache.org"
> Cc: "d...@nutch.apache.