Yes, FetchSchedule#getFields() is never used. So maybe in GeneratorJob
we can add a method that collects the fields before initMapperJob is called.
Code like this:
GeneratorJob#getFields:
  public Collection<WebPage.Field> getFields(Job job) {
    // Start from the fields the generator always needs.
    Collection<WebPage.Field> fields = new HashSet<WebPage.Field>(FIELDS);
    // Add whatever the configured FetchSchedule asks for.
    fields.addAll(FetchScheduleFactory.getFetchSchedule(job.getConfiguration()).getFields());
    return fields;
  }
GeneratorJob#run:
  currentJob = new NutchJob(getConf(), "generate: " +
      getConf().get(BATCH_ID));
  Collection<WebPage.Field> fields = getFields(currentJob);
  StorageUtils.initMapperJob(currentJob, fields, SelectorEntry.class,
      WebPage.class, GeneratorMapper.class,
      SelectorEntryPartitioner.class, true);
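
To make the idea easier to try outside Nutch, here is a minimal standalone sketch of the same merge. It is only an illustration: Field, FetchSchedule, MetadataAwareSchedule and FIELDS below are made-up stand-ins, not the real Nutch classes, and the real WebPage.Field enum has different members.

```java
import java.util.Collection;
import java.util.EnumSet;
import java.util.HashSet;
import java.util.Set;

public class FieldMergeSketch {
  // Stand-in for WebPage.Field (assumption: the real enum has other members).
  enum Field { FETCH_TIME, SCORE, STATUS, METADATA }

  // Stand-in for the FetchSchedule contract discussed in this thread.
  interface FetchSchedule {
    Set<Field> getFields();
  }

  // Hypothetical custom schedule that also needs the metadata column.
  static class MetadataAwareSchedule implements FetchSchedule {
    @Override
    public Set<Field> getFields() {
      return EnumSet.of(Field.FETCH_TIME, Field.METADATA);
    }
  }

  // Fields the job always reads (stand-in for GeneratorJob.FIELDS).
  static final Set<Field> FIELDS =
      EnumSet.of(Field.FETCH_TIME, Field.SCORE, Field.STATUS);

  // Union of the job's own fields and the schedule's fields.
  // FIELDS is copied first, so the shared constant is never modified.
  static Collection<Field> getFields(FetchSchedule schedule) {
    Collection<Field> fields = new HashSet<Field>(FIELDS);
    fields.addAll(schedule.getFields());
    return fields;
  }

  public static void main(String[] args) {
    Collection<Field> fields = getFields(new MetadataAwareSchedule());
    System.out.println(fields.contains(Field.METADATA)); // prints "true"
    System.out.println(fields.size());                   // prints "4"
  }
}
```

With this, a schedule that needs the metadata column gets it loaded by the mapper without GeneratorJob having to know about that schedule in advance.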
On Thu, Apr 18, 2013 at 1:56 AM, Lewis John Mcgibbney <
[email protected]> wrote:
> Hi Canan,
> On Tue, Apr 16, 2013 at 9:46 PM, <[email protected]> wrote:
>
>>
>> dev Digest 17 Apr 2013 04:46:54 -0000 Issue 1595
>>
>> Re: FetchSchedule and Metadata
>> 23108 by: Canan GİRGİN
>>
>>
>> Hi,
>>
>> Can I open an issue in the Jira system about FetchSchedule's getFields()
>> method?
>> This method is never used, but it is necessary.
>> I extend the AbstractFetchSchedule class for my specific schedule
>> requirements, but I cannot use the data from the columns I need in my
>> class (e.g. the metadata column).
>>
>
> Yes I would please encourage you to open an issue for this. I am still not
> completely sure what the problem is (and no one else seems to have chimed
> in here) so please explain your problem as verbosely as possible and we can
> work on it.
> Sorry for losing track of this issue.
> Lewis
>
--
Don't Grow Old, Grow Up... :-)