Hi,
I am trying to get a collection of file names from S3, filtered by their LastModifiedDate:
List<String> fileNames = new ArrayList<String>();
ListObjectsRequest listObjectsRequest = new ListObjectsRequest()
        .withBucketName(s3_bucket)
        .withPrefix(logs_dir);
ObjectListing objectListing;
do {
    objectListing = s3Client.listObjects(listObjectsRequest);
    for (S3ObjectSummary objectSummary : objectListing.getObjectSummaries()) {
        if (objectSummary.getLastModified().compareTo(dayBefore) > 0
                && objectSummary.getLastModified().compareTo(dayAfter) < 1
                && objectSummary.getKey().contains(".log")) {
            fileNames.add(objectSummary.getKey());
        }
    }
    listObjectsRequest.setMarker(objectListing.getNextMarker());
} while (objectListing.isTruncated());
I would like to process these files using Spark. I understand that textFile reads a single text file. Is there any way to read all of the files that are part of the list?
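As far as I know, textFile also accepts a comma-separated list of paths, so one possible approach is to join the collected keys into a single string and pass that. A minimal sketch of the path-joining step (the bucket name, keys, and the JavaSparkContext `sc` are hypothetical placeholders):

```java
import java.util.Arrays;
import java.util.List;

public class ReadMultiplePaths {

    // Collapse a list of S3 paths into the comma-separated form
    // that textFile accepts as a single argument.
    static String joinPaths(List<String> fileNames) {
        return String.join(",", fileNames);
    }

    public static void main(String[] args) {
        // Hypothetical keys, as collected by the listing loop above.
        List<String> fileNames = Arrays.asList(
                "s3n://my-bucket/logs/a.log",
                "s3n://my-bucket/logs/b.log");

        String paths = joinPaths(fileNames);
        System.out.println(paths);

        // With a JavaSparkContext sc, all files would then be read
        // into one RDD (sketch only, not runnable without Spark):
        // JavaRDD<String> lines = sc.textFile(paths);
    }
}
```

Note the listing loop returns bare object keys, so each entry would need the `s3n://bucket/` (or `s3a://bucket/`) prefix before being passed to textFile.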
Thanks for your help.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Read-multiple-files-from-S3-tp22965.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]