Re: [Pdl-devel] Faster PDL Development Cycle---But How?

Chris Marshall Sun, 30 Aug 2015 09:51:11 -0700

All-

Here is a status update on "the big split" as I see it,
and on the accelerating PDLA development toward it.
Ed/Zaki, please amend/comment.


First off, while making a PDLA:: agile copy of PDL::
seemed clunky to me, it ended up being the _perfect_
answer to my concerns about breaking PDL usability
while the split and architecture are evolving.  The
PDLA work is already going and I'm relieved that the
stability and usability of PDL is being maintained,
even if not understood by all developers.

Second, the "big split" boils down to migrating all
PDL modules that depend on external libraries into
their own distributions on CPAN.  This will make their
development more agile as well but there may be some
hiccups in the process so be prepared.  You should
always have a version of the existing, monolithic PDL
distribution to fall back on.

Third, any current or lurking PDL developers should
read and participate in this discussion or forever
hold your peace.  We are making these changes and,
while I've been slow to move because of my concerns
about the transition and PDL usability for new or
non-guru users, I'm fully behind the PDLA and PDL::Core
and PDL3 work.

Finally, kmx has fixed the outstanding build problems
in our longlong-double-fix branch.  We need folks to
test this and add tests for large memory operations:
- manipulation of large piddles
- mapflex of large files
- make sure things don't break on 16GiB objects,...
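
To make the 16GiB case concrete, here is a minimal arithmetic
sketch in plain Perl (no PDL needed).  The helper names nelem_for
and fits_int32 are made up for illustration and are not part of
PDL; the element widths are the usual 1/4/8 bytes for byte, float
and double.

```perl
use strict;
use warnings;

my $INT32_MAX = 2**31 - 1;    # largest value a signed 32-bit index can hold

# Number of elements in an object of $bytes bytes with $width-byte elements.
# (Hypothetical helper, for illustration only.)
sub nelem_for {
    my ($bytes, $width) = @_;
    return $bytes / $width;
}

# True if an element count can still be addressed with a signed 32-bit index.
# (Hypothetical helper, for illustration only.)
sub fits_int32 {
    my ($nelem) = @_;
    return $nelem <= $INT32_MAX ? 1 : 0;
}

my $bytes = 16 * 2**30;                              # 16 GiB
my %width = (byte => 1, float => 4, double => 8);    # bytes per element

for my $type (sort { $width{$a} <=> $width{$b} } keys %width) {
    my $nelem = nelem_for($bytes, $width{$type});
    printf "%-6s %12.0f elements: %s\n", $type, $nelem,
        fits_int32($nelem) ? 'fits a 32-bit index' : 'needs a 64-bit index';
}
```

Even for doubles, 16GiB works out to 2**31 elements, which is one
more than a signed 32-bit index can hold, so tests at this size
exercise the 64-bit index paths for every element type listed.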

With this fix in a PDL-2.014 release, I think we'll
have the perfect metastable version of the existing
PDL distribution to rest on while the PDLA split and
development take place.

Regards,
Chris


On 8/30/2015 12:36, Chris Marshall wrote:
> On 8/25/2015 13:42, Zakariyya Mughal wrote:
>> On 2015-08-24 at 23:48:51 +0000, Chris Marshall wrote:
>>> PDL Developers-
>>>
>>>       With the addition of two active and highly motivated PDL developers
>>> (Zakariyya Mughal and Guggle "Ed" Worth) we've made significant progress
>>> in cleaning up the PDL distribution itself and the development process
>>> itself.  PDL is now run through test builds automatically on git commit
>>> via the Travis-CI framework of github.  Many perl platforms and PDL
>>> configuration options are exercised.  PDL-2.013 was the best tested
>>> pre-release release ever.
>>> ...snip...
>>>
>>> Let the discussions begin!
>> Hello Chris,
>>
>> First off, thank you for starting this conversation.
>>
>> Ed and I have been working on and off as time permits on preparing for
>> the split. The work we've been doing hasn't really generated much
>> traffic on the pdl-devel mailing list, but the #pdl and PDLPorters
>> GitHub organisation shows a very different story. There is a lot going
>> on there every few days. The discussion on those two mediums is a little
>> more agile than the mailing list or SourceForge and helps with formulating
>>
>> I highly recommend joining both by watching the repositories in
>> PDLPorters and following the IRC by either joining in a client or
>> tracking the backlog with <http://irclog.perlgeek.de/pdl/>.
>
> I also recommend participating in the mailing list. Collected information
> such as you have provided is the only way to track complicated discussions
> on #pdl or other irc sessions.  Thanks for the translation, Zaki.
>
>>
>> I'd like to summarise some of what we came up with on GitHub/IRC:
>>
>>   1. A split is necessary to not only make releases easier, but also
>>      development. We have worked on reducing the time required to build
>>      PDL across multiple environments down to a little over 1 hour.
>>
>>      This is still too long when you have perhaps 1.5 hours of tuits that
>>      day. So the work inevitably gets spread out over weeks.
>>
>>      A split would help decrease this friction.
>>
>>   2. Making `cpanm PDL` always work has always been part of the plan.
>>      Improving the PDL devops has helped with that. The plan is to
>>      continue doing that.
>>
>>      But large refactors such as this split can be quite daunting. We
>>      can't be sure we will stick the landing right the first time. But
>>      the job needs to move forward or it will fail via analysis paralysis
>>      even before it has begun.
>>
>>   3. Ed and I have been thinking about releasing a more agile, friendly
>>      fork of PDL under the PDLA namespace (for PDL Agile). The
>>      repositories will continue to live under the PDLPorters GitHub
>>      organisation.
>>
>>      We will start by applying the split. This will be followed by
>>      improving code coverage, fixes to the 64-bit indexing, formalising
>>      the badvalue semantics for more functions, and bug-fixes.
>>
>>      We plan on making sure that libraries such as PDL-Stats, PDL-IO-CSV,
>>      etc. remain compatible with this library. I believe there is a way
>>      to do this without making changes to the original code (via a subref
>>      in @INC).
>>
>>   4. The modules that come from the split will each be improved so that
>>      they are easy to install on their own. We already have plans to
>>      write Alien::Base modules for all of them.
>>
>>   5. In parallel with this, we will begin reaching out to distribution
>>      packagers. PDL has not been updated on many of them (some of which
>>      are on 2.4.x). This is already on the wishlist at
>>      <https://github.com/PDLPorters/pdl/issues/139>.
>>
>>   6. The current PDL distribution will remain as it is. Bugfixes will
>>      continue on PDL and they will be backported from PDLA. This approach
>>      has worked well for IPython/Jupyter (which underwent a split earlier
>>      this summer)[^jupyter-split]. Backporting fixes was a large part
>>      of what they had to go through.
>>
>>   7. Eventually, after we are sure that PDLA has maintained
>>      compatibility with PDL, the changes of PDLA will replace the
>>      current PDL repository.
>>
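
For readers unfamiliar with the "subref in @INC" idea in item 3,
here is a minimal sketch of how such a hook could map one namespace
onto another.  It is an illustration under stated assumptions, not
the mechanism PDLA actually uses: Old:: and New:: are hypothetical
stand-ins for PDL:: and PDLA::, make_namespace_hook is a made-up
helper, and the shim relies on plain inheritance, which would not
cover exported functions or package variables.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a module that exists only under the new
# namespace (in the thread's scenario, something under PDLA::).
package New::Greeter {
    sub hello { return "hello from " . __PACKAGE__ }
}
BEGIN { $INC{'New/Greeter.pm'} = __FILE__ }    # mark it as already loaded

package main;

# Build an @INC hook that answers requests for $old::* with a tiny shim
# package inheriting from the matching $new::* module.
# (make_namespace_hook is an illustrative name, not an existing API.)
sub make_namespace_hook {
    my ($old, $new) = @_;
    return sub {
        my ($self, $file) = @_;                # e.g. "Old/Greeter.pm"
        (my $module = $file) =~ s{/}{::}g;
        $module =~ s{\.pm\z}{};
        # Decline anything outside the old namespace.
        return unless $module =~ /\A\Q$old\E(::.*)?\z/;
        my $target = $new . ($1 // '');
        my $source = <<"END";
package $module;
require $target;
our \@ISA = ('$target');
1;
END
        # require accepts a filehandle from the hook; read the shim
        # from an in-memory file.
        open my $fh, '<', \$source or die "cannot open in-memory file: $!";
        return $fh;
    };
}

# Pushed last, so a really installed Old::* module would still win.
push @INC, make_namespace_hook('Old', 'New');

require Old::Greeter;                          # served by the hook
print Old::Greeter->hello, "\n";
```

Code written against the old names keeps working unchanged, which
is the property item 3 is after.  Putting the hook at the end of
@INC makes it a fallback only; putting it at the front would
override any installed copy of the old namespace instead.
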
>> Finally, I also have some ideas for PDL3 that I will post in about a
>> month's time. One of the top priorities on the feature list of PDL3's C
>> API needs to be the ability to do optimisations such as loop fusion. I
>> need to ponder on how to combine this with the Moo-like metaprogramming
>> that we envision. The Julia developers seem to be working on this, but
>> there are still big unresolved questions on the issue tracker.
>>
>> By the way, I think it might be better to avoid putting a number in the
>> name of this next major version of PDL. It's a personal opinion that
>> stems from marketing issues that are similar to what happened with
>> Osborne 1 <https://en.wikipedia.org/wiki/Osborne_effect> and somewhat
>> with Perl 6. This isn't a strongly held opinion, but I feel that it is
>> worth bringing up.
>
> The PDL3 moniker is just a way to distinguish the "new architecture/api work"
> from the existing PDL-2.x engine for reference to previous discussions.  I
> agree that putting numbers in module names is not good.
>
> --Chris
>


