Thanks Barry,

I'm not sure parallelizing would work, because there is a delay of several 
seconds to get a single image from this bucket.

I'd have to open several threads, and I'll probably run into other 
limitations.
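If I do try threads, I imagine something like this minimal sketch with 
`concurrent.futures`. The `fetch_image` function and URL list here are just 
placeholders simulating the per-image latency, not a real S3 client:

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def fetch_image(url):
    # Placeholder for a real S3 GET (e.g. via requests or boto3);
    # the sleep stands in for the per-image network latency.
    time.sleep(0.01)
    return url, b"fake image bytes"

# Hypothetical object URLs standing in for the real dataset keys.
urls = [f"https://example-bucket.s3.amazonaws.com/img_{i}.jpg"
        for i in range(100)]

# Many requests in flight at once amortize the per-image delay,
# since S3 sustains high concurrency even when single GETs are slow.
results = {}
with ThreadPoolExecutor(max_workers=32) as pool:
    futures = [pool.submit(fetch_image, u) for u in urls]
    for fut in as_completed(futures):
        url, data = fut.result()
        results[url] = data

print(len(results))  # number of images fetched
```

With 32 workers the total wall time is roughly the per-image delay times 
(100 / 32) instead of 100 sequential delays, which is where the win would 
come from if S3 really does tolerate that concurrency.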

On Thursday, August 22, 2019 at 6:56:50 PM UTC+2, barryhunter wrote:
>
> Well, putting them into 'static' files in an app won't work! It's a bit 
> hidden, but there is a 10,000 file limit:
>
> https://cloud.google.com/appengine/docs/standard/python/how-requests-are-handled
>
> ... Plus you don't really 'upload' apps incrementally, so you would need 
> 'somewhere' to first download the entire dataset, package it, then upload 
> it. I doubt that would be an easy process with 1 TB (even if you can work 
> around the 10,000 file limit!)
>
>
> You *could* upload the data to https://cloud.google.com/storage/ - which 
> is roughly comparable to an S3 bucket. But again, you will be downloading 
> all the data, uploading it to Cloud Storage, then just downloading it 
> *again* for use in the process. (Downloading from Cloud Storage is going 
> to be roughly comparable to S3, maybe a bit quicker, but not massively.)
>
>
> ... seems wasteful. You're going to have to download the data anyway, so 
> just download from AWS and use it *directly*. It might be painful, but it 
> should work. If you find AWS slow, then download images in parallel (while 
> individual images might be relatively slow, S3 can sustain high (even 
> massive) concurrency - i.e. downloading lots of images at once!)
>
>
> This is an exercise in concurrent processing and throughput. Don't get 
> *distracted* trying to build another storage platform; it's unlikely you 
> will do *better* than S3.
>
>
>
> On Thu, Aug 22, 2019 at 5:18 PM ALT-EMAIL Virilo Tejedor <
> [email protected]> wrote:
>
>> Hi all,
>>
>> I'd like to create a static web server to store almost 1 TB of images.
>>
>> It is an open-source dataset that I'd like to use to train a Deep Learning 
>> model.
>>
>> I have free usage of GPUs and an Internet connection on another platform, 
>> but they don't provide 1 TB of storage.
>>
>> I also have $600 in Google Cloud credits, and I was wondering if there was 
>> an easy way to create something to feed images to the server on the other 
>> platform.
>>
>> The data source is available as an AWS bucket. I tried to connect the GPU 
>> machine directly to the AWS bucket via awscli, but it is much too slow. 
>> It's as if the bucket were designed for a complete sync rather than for 
>> continuous random access to individual files.
>>
>> I've thought of two possible approaches:
>>
>>         - Execute a Python script in GAE to download the dataset and 
>> create a GAE static web server: 
>> https://cloud.google.com/appengine/docs/standard/python/getting-started/hosting-a-static-website
>>
>>         - Execute a Python script in GAE to download the dataset and 
>> create a Google Cloud CDN.
>>
>> Do you think either of these approaches is valid for feeding the model 
>> during training?
>>
>> I'm a newbie with GAE, and any help, starting point, or idea will be very 
>> welcome.
>>
>> Thanks in advance
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/google-appengine/cbdc86dd-9ef5-4c64-a66d-1d10c91006bd%40googlegroups.com.
