2016-08-19 5:05 GMT-03:00 Sven Van Caekenberghe <[email protected]>:

>
> > On 19 Aug 2016, at 09:44, Hernán Morales Durand <
> [email protected]> wrote:
> >
> >
> > 2016-08-18 16:57 GMT-03:00 Sven Van Caekenberghe <[email protected]>:
> >
> > > On 18 Aug 2016, at 18:35, Hernán Morales Durand <
> [email protected]> wrote:
> > >
> > > Hi Sven!
> > >
> > > Yes, this is a lot of requests to a public database named GenBank
> > > through a public API called "Entrez". I repeated the test today to
> > > send you a log of what's happening. The requests are done in batches
> > > of 500 uids because they limit/penalize huge requests. Each uid is
> > > human metadata, I have requested 20,000 samples, and Argentina has
> > > one of the worst connections in the world.
> >
> > I see that you do 10 requests concurrently, correct? In different
> > threads/processes?
> >
> >
> > Mmmm no, I wrote #splitDownload methods to handle batch requests
> (historically NCBI limits to 3 requests per second)... This is the code
> (entrezUrlUidLimit = 500)
>
> Ah, I think I know what happened. You probably called #logToTranscript
> multiple times with an older version of Zinc. It seems as if every log
> event is displayed multiple times. The code should read:
>
> ZnLogEvent class>>#logToTranscript
>         self stopLoggingToTranscript.
>         ^ self announcer when: ZnLogEvent do: [ :event | Transcript crShow: event ].
>
> ZnLogEvent class>>#stopLoggingToTranscript
>         self announcer unsubscribe: self
>
> > >>splitDownload
> >
> >     | index splittedRs |
> >
> >     splittedRs := OrderedCollection new: self entrezUrlUidLimit.
> >     index := 1.
> >     self uids do: [: id  |
> >             splittedRs add: id.
> >             index \\ self entrezUrlUidLimit = 0
> >                 ifTrue: [
> >                     self bioLog: 'Requesting ' , splittedRs size asString , ' records to Entrez'.
> >                     self results add: (self genBankFetchRecords: splittedRs).
> >                     splittedRs := OrderedCollection new: self entrezUrlUidLimit ].
> >             index := index + 1 ]
> >         displayingProgress: 'Downloading...' translated.
> >     " Add remaining records "
> >     self bioLog: 'Requesting ' , splittedRs size asString , ' records to Entrez'.
> >     self results add: (self genBankFetchRecords: splittedRs).
> >     ^ self results
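
A minimal sketch of the same batching idea, stepping through the collection by start index instead of counting elements. It assumes nothing beyond core collection protocol; `uids`, `limit` and `batches` are illustrative stand-ins for `self uids`, `self entrezUrlUidLimit` and the per-batch fetch, not BioSmalltalk API. One difference from the method above: no trailing empty batch is produced when the number of uids is an exact multiple of the limit.

```smalltalk
"Sketch only: the variables below are hypothetical stand-ins, not BioSmalltalk API."
| uids limit batches |
uids := (1 to: 1234) asOrderedCollection.
limit := 500.
batches := OrderedCollection new.
1 to: uids size by: limit do: [ :start |
	"Each batch holds at most limit elements; the last one may be shorter."
	batches add: (uids copyFrom: start to: (start + limit - 1 min: uids size)) ].
batches collect: [ :each | each size ]
```

In a real download, each batch would be passed to the fetch call in place of being collected.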
> >
> > Then I just isolated the code in Pharo 5 and did a break (Alt + . on
> Windows).
> >
> > 2016-08-19 04:06:33 Requesting 500 records to Entrez
> > 2016-08-19 04:06:33 Executing...BioEFetchSeq
> > 2016-08-19 04:07:17 Requesting 500 records to Entrez
> > 2016-08-19 04:07:17 Executing...BioEFetchSeq
> > 2016-08-19 04:08:20 Requesting 500 records to Entrez
> > 2016-08-19 04:08:20 Executing...BioEFetchSeq
> > 2016-08-19 04:09:28 Requesting 500 records to Entrez
> > 2016-08-19 04:09:28 Executing...BioEFetchSeq
> > 2016-08-19 04:10:34 Requesting 500 records to Entrez
> > 2016-08-19 04:10:34 Executing...BioEFetchSeq
> > 2016-08-19 04:12:09 Requesting 500 records to Entrez
> > 2016-08-19 04:12:09 Executing...BioEFetchSeq
> > 2016-08-19 04:14:05 Requesting 500 records to Entrez
> > 2016-08-19 04:14:05 Executing...BioEFetchSeq
> > 2016-08-19 04:14:54 Requesting 500 records to Entrez
> > 2016-08-19 04:14:54 Executing...BioEFetchSeq
> > 2016-08-19 04:15:58 Requesting 500 records to Entrez
> > 2016-08-19 04:15:58 Executing...BioEFetchSeq
> > 2016-08-19 04:17:38 Requesting 500 records to Entrez
> > 2016-08-19 04:17:38 Executing...BioEFetchSeq
> > 2016-08-19 04:19:12 Requesting 500 records to Entrez
> > 2016-08-19 04:19:12 Executing...BioEFetchSeq
> > 2016-08-19 04:21:01 Requesting 500 records to Entrez
> > 2016-08-19 04:21:01 Executing...BioEFetchSeq
> > 2016-08-19 04:22:41 Requesting 500 records to Entrez
> > 2016-08-19 04:22:41 Executing...BioEFetchSeq
> > 2016-08-19 04:24:17 Requesting 500 records to Entrez
> > 2016-08-19 04:24:17 Executing...BioEFetchSeq
> > 2016-08-19 04:25:29 Requesting 500 records to Entrez
> > 2016-08-19 04:25:29 Executing...BioEFetchSeq
> > 2016-08-19 04:26:34 Requesting 500 records to Entrez
> > 2016-08-19 04:26:34 Executing...BioEFetchSeq
> > 2016-08-19 04:28:48 Requesting 500 records to Entrez
> > 2016-08-19 04:28:48 Executing...BioEFetchSeq
> > 2016-08-19 04:29:37 Requesting 500 records to Entrez
> > 2016-08-19 04:29:37 Executing...BioEFetchSeq
> > 2016-08-19 04:30:53 Requesting 500 records to Entrez
> > 2016-08-19 04:30:53 Executing...BioEFetchSeq
> > 2016-08-19 04:32:43 Requesting 500 records to Entrez
> > 2016-08-19 04:32:43 Executing...BioEFetchSeq
> > 2016-08-19 04:33:56 Requesting 500 records to Entrez
> > 2016-08-19 04:33:56 Executing...BioEFetchSeq
> > 2016-08-19 04:34:29 Requesting 500 records to Entrez
> > 2016-08-19 04:34:29 Executing...BioEFetchSeq
> >
> > And now the Transcript displays the intervals (all at once, of course).
> > Something weird is happening there.
>
> It is really hard for me to see the context, which is probably complex.
> Unless you isolate the problem for me so that I can run it in a standard
> image, I won't be able to say much more.
>
>
Here it is with an isolated image:

http://igevet.fcv.unlp.edu.ar/V9T8NT6S01R-Alignment.zip
(mirror just in case)
https://dl.dropboxusercontent.com/u/103833630/V9T8NT6S01R-Alignment.zip

(Some classes could be missing)


> > Another problem I have is that the Transcript writes everything only
> > once the exception is signaled :(
> > > Is there a way to revert to the old behavior, where each Transcript
> > > show: writes to the Transcript in situ?
> >
> > Do you block the UI thread (while running your main top level
> expression) ? Try forking.
> >
> >
> > I know, but I don't want my users opening and closing critical windows
> while downloading data causing more disasters.
> > Is there a way to roll back to the older behavior, where a Transcript
> > show: had higher priority? I don't know why this was changed; it is
> > *critical* to see what happens in real time!
>
> To see what I mean, try both of the following expressions with a clean
> Transcript:
>
> 1 to: 10 do: [ :n | ('Now doing ', n asString) crLog. 1 second wait ].
>
> [ 1 to: 10 do: [ :n | ('Now doing ', n asString) crLog. 1 second wait ] ]
> fork.
>
> If you want to signal progress, there are (better) options for that.
>
> Have a look at ZnClientTest>>#testProgress or even 
> MCHttpRepository>>#displayProgress:during:
> for examples.
>
>
Cool! Thanks, I will check as soon as I can.

Hernán



> >
> > > ClosedConnection wasn't signaled today, but OutOfMemory was.
> >
> > See above.
> >
> >
> > Thanks Sven
> >
> > Hernán
>
>
>
