Try something like this:
def readGenericRecords(sc: SparkContext, inputDir: String,
                       startDate: Date, endDate: Date) = {
  // assuming a list of paths
  val paths: Seq[String] = getInputPaths(inputDir, startDate, endDate)
  val job = Job.getInstance(new Configuration(sc.hadoopConfiguration))
def readGenericRecords(sc: SparkContext, inputDir: String,
                       startDate: Date, endDate: Date) = {
  val path = getInputPaths(inputDir, startDate, endDate)
  sc.newAPIHadoopFile[AvroKey[GenericRecord], NullWritable,
    AvroKeyInputFormat[GenericRecord]]("/A/B/C/D/D/2015/05/22/out-r-*.avro")
}
You can do that using FileInputFormat.addInputPath, which adds each directory to the job's input paths.
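A minimal sketch of that approach, in case it helps. The object and method names (MultiPathAvroSketch, readAvroDirs) are made up for illustration, and it assumes the Hadoop 2 mapreduce API plus avro-mapred on the classpath. Each directory or glob is registered on a Job, and the resulting configuration is handed to sc.newAPIHadoopRDD:

```scala
import org.apache.avro.generic.GenericRecord
import org.apache.avro.mapred.AvroKey
import org.apache.avro.mapreduce.AvroKeyInputFormat
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.NullWritable
import org.apache.hadoop.mapreduce.Job
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Hypothetical wrapper object; not part of the original thread.
object MultiPathAvroSketch {

  def readAvroDirs(sc: SparkContext,
                   paths: Seq[String]): RDD[(AvroKey[GenericRecord], NullWritable)] = {
    // Copy the Hadoop configuration so the shared one is not mutated.
    val job = Job.getInstance(new Configuration(sc.hadoopConfiguration))

    // Register every directory (or glob) as an input path of the job.
    paths.foreach(p => FileInputFormat.addInputPath(job, new Path(p)))

    // Read all registered paths as a single RDD.
    sc.newAPIHadoopRDD(
      job.getConfiguration,
      classOf[AvroKeyInputFormat[GenericRecord]],
      classOf[AvroKey[GenericRecord]],
      classOf[NullWritable])
  }
}
```

The paths passed in would be whatever getInputPaths returns in your code; nothing here depends on how that list is built.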
2015-05-27 10:41 GMT+02:00 ayan guha :
What about /blah/*/blah/out*.avro?
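If the directories differ in only one path component, Hadoop globs also accept brace alternation such as {21,22}, so a single pattern can cover both. A small sketch of building such a pattern follows; the helper name multiDirGlob and the directory "21" are assumptions for illustration, since the second directory is not given in this thread:

```scala
// Hypothetical helper; not part of the original thread.
object GlobSketch {

  // Builds one Hadoop-style glob matching several sibling directories,
  // e.g. base/{21,22}/out-r-*.avro
  def multiDirGlob(base: String, dirs: Seq[String], filePattern: String): String = {
    val alternatives = dirs.mkString("{", ",", "}")
    s"$base/$alternatives/$filePattern"
  }
}
```

The resulting string can be passed as the path argument of sc.newAPIHadoopFile in the same way as the single-directory pattern.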
On 27 May 2015 18:08, "ÐΞ€ρ@Ҝ (๏̯͡๏)" wrote:
I am doing that now.
Is there no other way?
On Wed, May 27, 2015 at 12:40 PM, Akhil Das wrote:
How about creating two RDDs and unioning them with sc.union(first, second)?
Thanks
Best Regards
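A rough sketch of the union approach, assuming one RDD per directory glob. The names UnionSketch, readDir and readAll are hypothetical, and the list of globs is assumed to be non-empty:

```scala
import org.apache.avro.generic.GenericRecord
import org.apache.avro.mapred.AvroKey
import org.apache.avro.mapreduce.AvroKeyInputFormat
import org.apache.hadoop.io.NullWritable
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Hypothetical wrapper object; not part of the original thread.
object UnionSketch {

  type AvroRdd = RDD[(AvroKey[GenericRecord], NullWritable)]

  // One RDD per directory glob, read the same way as in the original question.
  def readDir(sc: SparkContext, glob: String): AvroRdd =
    sc.newAPIHadoopFile[AvroKey[GenericRecord], NullWritable,
      AvroKeyInputFormat[GenericRecord]](glob)

  // Combine the per-directory RDDs into a single RDD.
  def readAll(sc: SparkContext, globs: Seq[String]): AvroRdd =
    sc.union(globs.map(readDir(sc, _)))
}
```

With two globs this is equivalent to sc.union(first, second); the Seq form just avoids naming each RDD.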
On Wed, May 27, 2015 at 11:51 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) wrote:
I have this piece of code:
sc.newAPIHadoopFile[AvroKey[GenericRecord], NullWritable,
  AvroKeyInputFormat[GenericRecord]](
  "/a/b/c/d/exptsession/2015/05/22/out-r-*.avro")
It takes "/a/b/c/d/exptsession/2015/05/22/out-r-*.avro" as input.
I want to give a second directory as input but this is a inva