Re: [Pdl-devel] Faster PDL Development Cycle---But How?

dhunt Mon, 31 Aug 2015 08:15:17 -0700

Hi all: I would like to echo Craig's concern about keeping some stability with at least one version of PDL. Here at UCAR we have been using PDL almost since the project started. We currently use PDL in production processing of GPS Radio Occultation data that is used by all major world weather centers for their forecasts. We are planning on using many thousands of lines of PDL code in an upcoming NOAA satellite mission that will also feed data in real-time to weather centers.

So, while work on PDL3 or PDLA is interesting and we will try to adopt it in time, we have a need for some stability!


Thanks much,

  Doug Hunt


On 8/30/15 8:51 PM, Craig DeForest wrote:

I apologize to everyone for staying silent so long. I support the plan for a dual pathway. I have been of two minds about the current resurgence: on the one hand I am delighted at the work Aki and Ed are doing; on the other hand I have been worried for stability of existing projects of some importance that build on PDL (for example, PDL is used to identify new emerging magnetic flux on the Sun in real-time by the SDO mission; to carry out fundamental data reduction steps for STEREO; and as the basis of many unique hybrid C/Perl codes of some complexity). Developing the new core and/or split under PDLA seems like a great way to go, much like the even/odd version number split in Perl itself: it mitigates the risk of breaking existing important tools, while keeping further development from bogging down. Well done.

My main remaining concern is that, if PDLA were to become a true fork of PDL, not feeding back into the legacy namespace at intervals, it could derail both projects. I hope that a permanent fork is not the intent of PDLA...


(Mobile)

On Aug 30, 2015, at 6:50 PM, Chris Marshall <[email protected]<mailto:[email protected]>> wrote:

All-

Here is a status update on "the big split" from my
perception and the accelerating PDLA development to
that effect.  Ed/Zaki, please amend/comment.

First off, while making a PDLA:: agile copy of PDL::
seemed clunky to me, it ended up being the _perfect_
answer to my concerns about breaking PDL usability
while the split and architecture are evolving.  The
PDLA work is already going and I'm relieved that the
stability and usability of PDL is being maintained,
even if not understood by all developers.

Second, the "big split" boils down to migrating all
PDL modules that depend on external libraries into
their own distributions on CPAN.  This will make their
development more agile as well but there may be some
hiccups in the process so be prepared.  You should
always have a version of the existing, monolithic PDL
distribution to fall back on.

Third, any current or lurking PDL developers should
read and participate in this discussion or forever
hold your peace.  We are making these changes and,
while I've been slow to move because of my concerns
about the transition and PDL usability for new or
non-guru users, I'm fully behind the PDLA and PDL::Core
and PDL3 work.

Finally, kmx has fixed the outstanding build problems
in our longlong-double-fix branch.  We need folks to
test this and add tests for large memory operations:
- manipulation of large piddles
- mapflex of large files
- make sure things don't break on 16GiB objects,...

With this fix in a PDL-2.014 release, I think we'll
have the perfect metastable version of the existing
PDL distribution to rest on while the PDLA split and
development takes place.
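
The 16 GiB figure in the test list above is a natural boundary to probe. As a rough illustration (assuming 8-byte doubles; the variable names are made up for this sketch and no PDL calls are involved), an object of that size holds 2**31 elements, one more than a signed 32-bit integer can count:

```perl
use strict;
use warnings;

# Assumption: elements are 8-byte IEEE 754 doubles.
my $bytes_per_double = 8;
my $object_bytes     = 16 * 2**30;                 # 16 GiB
my $elements         = $object_bytes / $bytes_per_double;
my $int32_max        = 2**31 - 1;                  # largest signed 32-bit value

printf "elements in a 16 GiB array of doubles: %.0f\n", $elements;
printf "largest signed 32-bit value:           %.0f\n", $int32_max;
print  "element count does not fit in a signed 32-bit integer\n"
    if $elements > $int32_max;
```

Tests that allocate or map objects at or just above this size would exercise any code path that still stores sizes or offsets in a 32-bit type.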

Regards,
Chris


On 8/30/2015 12:36, Chris Marshall wrote:

On 8/25/2015 13:42, Zakariyya Mughal wrote:

On 2015-08-24 at 23:48:51 +0000, Chris Marshall wrote:

PDL Developers-

     With the addition of two active and highly motivated PDL developers
(Zakariyya Mughal and Guggle "Ed" Worth) we've made significant progress
in cleaning up the PDL distribution itself and the development process
itself. PDL is now run through test builds automatically on git commit
via the Travis-CI framework of github.  Many perl platforms and PDL
configuration options are exercised.  PDL-2.013 was the best tested
pre-release release ever.
...snip...

Let the discussions begin!

Hello Chris,

First off, thank you for starting this conversation.

Ed and I have been working on and off as time permits on preparing for
the split. The work we've been doing hasn't really generated much
traffic on the pdl-devel mailing list, but the #pdl channel and the
PDLPorters GitHub organisation show a very different story. There is a
lot going on there every few days. The discussion on those two mediums
is a little more agile than the mailing list or SourceForge and helps
with formulating
I highly recommend joining both by watching the repositories in
PDLPorters and following the IRC by either joining in a client or
tracking the backlog with <http://irclog.perlgeek.de/pdl/>.

I also recommend participating in the mailing list. Collected information
such as you have provided is the only way to track complicated discussions
on #pdl or other irc sessions.  Thanks for the translation, Zaki.


I'd like to summarise some of what we came up with on GitHub/IRC:

 1. A split is necessary to not only make releases easier, but also
    development. We have worked on reducing the time required to build
    PDL across multiple environments down to a little over 1 hour.

    This is still too long when you have perhaps 1.5 hours of tuits
    that day. So the work inevitably gets spread out over weeks.

    A split would help decrease this friction.

 2. Making `cpanm PDL` always work has always been part of the plan.
    Improving the PDL devops has helped with that. The plan is to
    continue doing that.

    But large refactors such as this split can be quite daunting. We
    can't be sure we will stick the landing right the first time. But
    the job needs to move forward or it will fail via analysis
    paralysis even before it has begun.

 3. Ed and I have been thinking about releasing a more agile, friendly
    fork of PDL under the PDLA namespace (for PDL Agile). The
    repositories will continue to live under the PDLPorters GitHub
    organisation.

    We will start by applying the split. This will be followed by
    improving code coverage, fixes to the 64-bit indexing, formalising
    the badvalue semantics for more functions, and bug-fixes.

    We plan on making sure that libraries such as PDL-Stats,
    PDL-IO-CSV, etc. remain compatible with this library. I believe
    there is a way to do this without making changes to the original
    code (via a subref in @INC).

 4. The modules that come from the split will each be improved so that
    they are easy to install on their own. We already have plans to
    write Alien::Base modules for all of them.

 5. In parallel with this, we will begin reaching out to distribution
    packagers. PDL has not been updated on many of them (some of which
    are on 2.4.x). This is already on the wishlist at
    <https://github.com/PDLPorters/pdl/issues/139>.

 6. The current PDL distribution will remain as it is. Bugfixes will
    continue on PDL and they will be backported from PDLA. This
    approach has worked well for IPython/Jupyter (which underwent a
    split earlier this summer)[^jupyter-split]. Backporting fixes was
    a large part of what they had to go through.

 7. Eventually, after we are sure that PDLA has maintained
    compatibility with PDL, the changes of PDLA will replace the
    current PDL repository.
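
Item 3 above mentions keeping existing libraries working "via a subref in @INC". What follows is only a minimal sketch of that general Perl mechanism, not the actual PDLA code: the Legacy:: and Agile:: namespaces and the Agile::Demo module are hypothetical stand-ins, defined inline so the example runs without PDL or PDLA installed. The hook synthesizes a small stub whenever something asks for Legacy::X, and the stub loads Agile::X and inherits from it.

```perl
use strict;
use warnings;

# Hypothetical module in the "new" namespace, defined inline for the demo.
package Agile::Demo {
    sub answer { 42 }
}
# Mark it as already loaded so that `require Agile::Demo` is a no-op.
BEGIN { $INC{'Agile/Demo.pm'} = __FILE__ }

package main;

# A code reference in @INC is called as ($self, $filename) for every
# require. Returning nothing lets Perl continue with the rest of @INC;
# returning a filehandle makes Perl compile what it reads from it.
unshift @INC, sub {
    my ($self, $file) = @_;
    return unless $file =~ m{^Legacy/(.+)\.pm$};
    (my $module = $1) =~ s{/}{::}g;

    # Stub source: load the new module and inherit its methods.
    my $src = sprintf "package Legacy::%s;\nrequire Agile::%s;\n"
                    . "our \@ISA = ('Agile::%s');\n1;\n", ($module) x 3;
    open my $fh, '<', \$src or die "cannot open in-memory stub: $!";
    return $fh;
};

# Unmodified "old" code keeps using the legacy name.
require Legacy::Demo;
print Legacy::Demo->answer, "\n";
```

Inheritance only covers method calls; code that calls functions by fully qualified name or relies on exported symbols would need the stub to alias those as well, so a real compatibility layer would likely be more involved than this.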

Finally, I also have some ideas for PDL3 that I will post in about a
month's time. One of the top priorities on the feature list of PDL3's C
API needs to be the ability to do optimisations such as loop fusion. I
need to ponder on how to combine this with the Moo-like metaprogramming
that we envision. The Julia developers seem to be working on this, but
there are still big unresolved questions on the issue tracker.

By the way, I think it might be better to avoid putting a number in the
name of this next major version of PDL. It's a personal opinion that
stems from marketing issues that are similar to what happened with
Osborne 1 <https://en.wikipedia.org/wiki/Osborne_effect> and somewhat
with Perl 6. This isn't a strongly held opinion, but I feel that it is
worth bringing up.


The PDL3 moniker is just a way to distinguish the "new architecture/api
work" from the existing PDL-2.x engine for reference to previous
discussions.  I agree that putting numbers in module names is not good.

--Chris



------------------------------------------------------------------------------
_______________________________________________
pdl-devel mailing list
[email protected] <mailto:[email protected]>
https://lists.sourceforge.net/lists/listinfo/pdl-devel


