Ken,

Thanks a lot! You solved my problem.

Thanks,
Eran

On Thu, Oct 29, 2009 at 2:35 PM, Ken Krugler <kkrugler_li...@transpac.com> wrote:

>
> On Oct 29, 2009, at 4:00am, Eran Zinman wrote:
>
>> Hello everyone,
>>
>> I've created a plugin for Nutch 1.0 that extends the parser.
>>
>> This plugin extracts several kinds of information from the document DOM.
>>
>> In some cases I need to extract the "href" of a certain link. The link in
>> the DOM is still relative, as it was originally written in the HTML
>> document, so for example it might be a link with an href of "/music".
>>
>> My question is: how can I make this link an absolute URL, for example
>> turning "/music" into "http://www.example.com/music"?
>>
>
> new URL(baseUrl, relativeString)
>
> will return the full URL, leaving aside a few minor edge cases.
>
> The baseUrl will be the URL of the containing document, or the value of the
> (potentially relative) location: response header field if it exists, or the
> value of the <base> tag in the <head> element, if that exists.
>
> -- Ken
>
>
> --------------------------
> Ken Krugler
> TransPac Software, Inc.
> <http://www.transpac.com>
> +1 530-210-6378
>
>
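
A minimal sketch of the new URL(baseUrl, relativeString) approach quoted
above, assuming the base is simply the URL of the containing document. The
class name ResolveHref, the helper resolve(), and the example page URL are
hypothetical and not part of Nutch; inside a parser plugin the base would
come from whatever URL the plugin is handed for the page being parsed.

```java
import java.net.MalformedURLException;
import java.net.URL;

class ResolveHref {
    // Hypothetical helper: resolve an href taken from the DOM against the
    // URL of the document it came from, using java.net.URL's
    // (URL context, String spec) constructor.
    static String resolve(String baseUrl, String href) {
        try {
            URL base = new URL(baseUrl);
            return new URL(base, href).toString();
        } catch (MalformedURLException e) {
            // Assumption: callers prefer an unchecked exception for bad input.
            throw new IllegalArgumentException("Cannot resolve " + href, e);
        }
    }

    public static void main(String[] args) {
        String page = "http://www.example.com/artists/index.html";
        // Root-relative href: replaces the whole path of the base URL.
        System.out.println(resolve(page, "/music"));
        // Document-relative href: resolved against the base URL's directory.
        System.out.println(resolve(page, "albums.html"));
        // An href that is already absolute is returned unchanged.
        System.out.println(resolve(page, "http://other.example.org/x"));
    }
}
```

If the page declares a <base> tag, its href would be passed as the base
instead of the page URL.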
