Hi Chris, thanks so much for taking care of this. Looking forward to
testing the feature. Cheers,
Paul

--
Paul Rigor
http://www.ics.uci.edu/~prigor


On Mon, Apr 29, 2013 at 11:03 AM, Christopher Lee <l...@chem.ucla.edu> wrote:

> Hi Paul,
> Sorry, this mix of Python, Pyrex, and C code is awfully dense and hard to
> make sense of.  However, based on looking at the code a few days ago, I
> think the task of limiting the number of files that are opened at once
> during readMAFfiles() can be done in the relatively simple way I outlined.
>  The build_ifile array is completely internal to readMAFfiles(); it is not
> passed to any other function.  Note that the later call to buildFiles()
> (and hence to each NLMSASequence.build_files()) as its first step simply
> closes the build_ifile on each NLMSASequence (we'd only need to make very
> minor adjustments to that code).  So once readMAFfiles() is done writing,
> everything else is done one file at a time.  Thus our task really does not
> extend outside readMAFfiles() itself.
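[Editor's note: the approach Chris outlines above — capping how many build files stay open at once inside readMAFfiles(), since everything afterwards works one file at a time — could be sketched as a small LRU handle cache. This is only an illustrative sketch; `FileCache`, its method names, and the append-mode reopen strategy are assumptions, not the actual Pygr code.]

```python
from collections import OrderedDict

class FileCache:
    """Cap the number of simultaneously open files, evicting the least
    recently used handle and reopening files in append mode on demand.

    Hypothetical sketch -- not the actual Pygr readMAFfiles() code.
    """

    def __init__(self, max_open=100):
        self.max_open = max_open
        self.open_files = OrderedDict()  # path -> handle, oldest first

    def get(self, path):
        if path in self.open_files:
            self.open_files.move_to_end(path)   # mark most recently used
        else:
            if len(self.open_files) >= self.max_open:
                _, oldest = self.open_files.popitem(last=False)
                oldest.close()                  # evict least recently used
            # 'ab' creates the file if needed and preserves prior contents,
            # so an evicted file can be written to again later
            self.open_files[path] = open(path, 'ab')
        return self.open_files[path]

    def close_all(self):
        for handle in self.open_files.values():
            handle.close()
        self.open_files.clear()
```

Because writes to each build file are append-only during the MAF read, eviction is safe: closing flushes the handle, and a later `get()` reopens at the end of the file.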
>
> It sounds like it'd be most efficient if I try to write code for this over
> the next few days, then you can take a look at the changes and see what you
> think...
>
> The question of limiting the number of files that are opened during
> regular usage of the NLMSA (i.e. querying the alignment database) is
> completely separate.  I believe the current default mode of opening files
> only "onDemand" should keep the number of files from getting too big.  If
> we need to, we can later add code for again automatically closing some
> files if the number gets too big.
>
> Chris
>
>
>
> On Apr 28, 2013, at 6:03 PM, Paul Rigor wrote:
>
> > Hi Chris,
> >
> > I don't think replacing the build_ifile and nbuild arrays with the
> FileQueue you mentioned will be that straightforward. These two variables are
> used outside of the readMAFfiles() method, e.g., when loading indexes later on.
> >
> > Also, there are other implicit counters for nlmsa objects (and thus their
> associated interval files) that will need to be maintained, e.g., inlmsa and
> self.id. The linear id scheme for the interval files is not obviously
> amenable to LRU caching.
> >
> > Also, the creation of a new LPO sequence cannot be easily bound to the new
> file cache -- it's used everywhere, and I'm not sure about all of the
> dependencies on other open file handles. Furthermore, it's unclear how to
> reopen an associated interval file once it's closed. In other words, the code
> (from the latest git repo) isn't self-explanatory at the moment.
> Additionally, the saveInterval() method is quite confusing. What is it
> actually doing? Its argument list isn't consistent with examples of actual
> calls. The same goes for the newSequence() method.
> >
> > I'm trying to piece together a solution using the LRUcache extension
> from the PyTables project, but leaving the newSequence() method unmodified
> throws things off because of its implicit id generation.
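[Editor's note: one way around the reopen concern Paul raises above — that it's unclear how to reopen an interval file once closed — is to make the file path derivable from the linear sequence id, so an evicted handle can always be reconstructed. The sketch below assumes such a one-file-per-id naming scheme; `ifile_path`, `IdHandleCache`, and the naming convention are hypothetical and may not match Pygr's actual id-to-file mapping.]

```python
from collections import OrderedDict

# Hypothetical naming scheme: one build file per linear sequence id.
# The real Pygr id-to-file mapping may differ.
def ifile_path(basename, seq_id):
    return "%s.%d.build" % (basename, seq_id)

class IdHandleCache:
    """Map linear sequence ids to open handles, closing the least recently
    used handle when over capacity. Because the path is recoverable from
    the id, an evicted file can simply be reopened in append mode later.

    Hypothetical sketch -- not the actual Pygr code.
    """

    def __init__(self, basename, max_open=100):
        self.basename = basename
        self.max_open = max_open
        self.handles = OrderedDict()  # seq_id -> open handle

    def get(self, seq_id):
        if seq_id in self.handles:
            self.handles.move_to_end(seq_id)    # most recently used
        else:
            if len(self.handles) >= self.max_open:
                _, lru = self.handles.popitem(last=False)
                lru.close()  # safe: append mode restores the write position
            self.handles[seq_id] = open(ifile_path(self.basename, seq_id), "ab")
        return self.handles[seq_id]
```

Under this assumption the linear id scheme stops being an obstacle to LRU caching: the id itself is the cache key, and eviction never loses information because the file can be reopened by name.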
> >
> > What is the best way to isolate the changes? Again, I've just gone
> through the relevant code the past couple of days, so I'm probably
> misunderstanding a few things ;-)
> >
> > Thank you again for your time!
> > Paul
> >
> >
> > --
> > Paul Rigor
> > http://www.ics.uci.edu/~prigor
> >
>
> --
> You received this message because you are subscribed to the Google Groups
> "pygr-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to pygr-dev+unsubscr...@googlegroups.com.
> To post to this group, send email to pygr-dev@googlegroups.com.
> Visit this group at http://groups.google.com/group/pygr-dev?hl=en.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>
>
