Re: On the Removal of src:tensorflow

2019-09-05 Thread Sam Hartman
> "Mo" == Mo Zhou writes:

Mo> Hi Steffen, I'm wondering who will read it. In the past I
Mo> broadcasted some bits I learned to the public mailing lists and
Mo> people responded, but nobody had ever asked me any detail about
Mo> anything related.

I think a lot of us have read this and filed it away.
Right now, I think that there are a few people in Debian who have been
following the issues closely who may be around the same level you are.

And there are a lot of us who have read this, filed it away, and six
months to years from now will be reading it again and having questions
when we run across something in the deep learning space.



Re: On the Removal of src:tensorflow [and 1 more messages]

2019-09-05 Thread Mo Zhou
On 2019-09-05 11:31, Ian Jackson wrote:
> Mo Zhou writes ("Re: On the Removal of src:tensorflow [and 1 more messages]"):
> 
> ... when you have done this please let me know and I will make a wiki
> page at least referencing your document(s), and maybe summarising.

Look forward to collaborating with you and porting/linking the WIP
document to debian wiki.

> If you could arrange to mirror your article on some Debian server that
> would avoid possible future linkrot (although if you plan for your own
> article to have a good longterm stable url then maybe that's not
> needed).

Take it easy. I'm a sane upstream. The source will be available
on salsa, released under CC-BY-SA-4.0 license, and the compiled
format (maybe pdf) will be uploaded to people.d.o .



Re: On the Removal of src:tensorflow

2019-09-05 Thread Mo Zhou
On 2019-09-05 10:44, Jonathan Carter wrote:
> On 2019/09/05 05:55, Yao Wei wrote:
> I filed it and intend to keep it open for the foreseeable future. Based
> on Mo's original post in this thread, I decided that it will be better
> to look for alternatives to DeepSpeech (especially since all my intended
> use cases rely on very limited dictionaries).

The horrible fact is that one cannot find any decent alternative to
deep learning, especially on problems impossible for a non-intelligent
algorithm to solve. There are countless examples of things that only
deep learning could solve (traditional machine learning could solve a
portion of them but generally deep learning does the best).

For ASR (Automatic Speech Recognition) you will find nothing except for
deep learning among the best implementations. Traditional models
for ASR, such as HMMs (hidden Markov models), are just far less accurate
compared to deep learning, as reported by papers.

One day ML-Policy will eventually show its sanity.

> Things change though and the Tensorflow project might mature in a few
> years and the project might have better methods available for building it.

Tensorflow 1.X is a long-term support release and it's basically mature
enough. It will be maintained for a while even after the 2.X release.
An energetic developer could work on the abandoned cmake build and
produce packages enough for supporting applications such as deepspeech.

> I think a link to Mo's mail is sufficient on the DeepSpeech ITP bug
> report (and related ITPs). Less is probably more when it comes to wiki
> pages at this point.

:-)



Re: On the Removal of src:tensorflow [and 1 more messages]

2019-09-05 Thread Ian Jackson
Mo Zhou writes ("Re: On the Removal of src:tensorflow [and 1 more messages]"):
> Thanks for the offering. I'm not eager to write such a wiki page
> because it's not urgent. Plus, currently I have a rather complete
> image on the "Future Debian and Artificial Intelligence" topic
> based on experience. Of course I won't be satisfied by
> only documenting these small points.

OK, fair enough.  How about this

> I've already planned to write a long article on the whole topic
> with a good overview and some important details[1] including
> those mentioned in the previous mails. Please look forward to it.
> After finishing the article, porting it to debian wiki would be
> trivial enough.

... when you have done this please let me know and I will make a wiki
page at least referencing your document(s), and maybe summarising.

If you could arrange to mirror your article on some Debian server that
would avoid possible future linkrot (although if you plan for your own
article to have a good longterm stable url then maybe that's not
needed).

Regards,
Ian.

-- 
Ian Jackson   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: On the Removal of src:tensorflow [and 1 more messages]

2019-09-05 Thread Mo Zhou
Hi Ian,

Thanks for the offering. I'm not eager to write such a wiki page
because it's not urgent. Plus, currently I have a rather complete
image on the "Future Debian and Artificial Intelligence" topic
based on experience. Of course I won't be satisfied by
only documenting these small points.

I've already planned to write a long article on the whole topic
with a good overview and some important details[1] including
those mentioned in the previous mails. Please look forward to it.
After finishing the article, porting it to debian wiki would be
trivial enough.

On the other hand, rushing a debian wiki page covering only a
single small topic, out of git tracking, would result in something
like "write once, burden forever" to me once the URL had been
widely spread. So please wait for my article.

P.S.
I'm concurrently working on 3 non-trivial documentation projects:
1. "Debian and Gentoo's Linear Algebra Libraries" (i.e. BLAS/LAPACK)
2. The Unofficial "ML-Policy"
3. "Future Debian and Artificial Intelligence"
hope this could help me demonstrate how unwilling I am to
maintain a wiki page.

[1] and possibly submit it to lwn

On 2019-09-05 08:23, Ian Jackson wrote:
> Jonas Smedegaard writes ("Re: On the Removal of src:tensorflow"):
>> Please beware that there's a huge difference between being able to
>> stumble upon your notes in a web search - and in the case of a wiki page
>> to add additional notes to it - and then to have an open invitation to
>> ask questions to an expert.
>>
>> I encourage you to help get your valuable information onto our wiki,
>> regardless of the lack of feedback you have received on it.
> 
> Seconded.  However, I don't think this...
> 
>> Our wiki does not use mediawiki but an older(?) markup called Creole.
> 
> Mo Zhou writes ("Re: On the Removal of src:tensorflow"):
>> I dislike the mediawiki markup. [...]
> 
> ... is likely to be an answer to that.
> 
> Mo: if you write up your notes in some plain text format I will put
> them on the wiki for you.  You should probably do that by replying to
> this thread because of the principle of doing things in public.
> Please be sure to CC me on the email.  NB that I do *not* intend to take
> on the role of ongoing document editor and won't be incorporating
> comments made in response to your mail.
> 
> Thanks,
> Ian.



Re: On the Removal of src:tensorflow

2019-09-05 Thread Jonathan Carter
On 2019/09/05 05:55, Yao Wei wrote:
> On Wed, Sep 04, 2019 at 07:38:11PM -0700, Mo Zhou wrote:
>> I'm wondering who will read it. In the past I broadcasted some bits
>> I learned to the public mailing lists and people responded, but
>> nobody had ever asked me any detail about anything related.
> 
> Recently there's ITP for DeepSpeech (#921519) based on Baidu's research
> and Mozilla Common Voice project, which is depending on TensorFlow:
> 
> https://github.com/mozilla/DeepSpeech

I filed it and intend to keep it open for the foreseeable future. Based
on Mo's original post in this thread, I decided that it will be better
to look for alternatives to DeepSpeech (especially since all my intended
use cases rely on very limited dictionaries).

Things change though and the Tensorflow project might mature in a few
years and the project might have better methods available for building it.

> I think it is a good practice to leave a tombstone in the wiki that what
> you consider may be pitfalls for packaging deep learning programs and
> the problems that are specific to certain ML frameworks.
> 
> If anyone wants to resurrect TensorFlow in Debian, then it is a good
> reference to see what they would like to avoid.

I think a link to Mo's mail is sufficient on the DeepSpeech ITP bug
report (and related ITPs). Less is probably more when it comes to wiki
pages at this point.

-Jonathan

-- 
  ⢀⣴⠾⠻⢶⣦⠀  Jonathan Carter (highvoltage) 
  ⣾⠁⢠⠒⠀⣿⡁  Debian Developer - https://wiki.debian.org/highvoltage
  ⢿⡄⠘⠷⠚⠋   https://debian.org | https://jonathancarter.org
  ⠈⠳⣄  Be Bold. Be brave. Debian has got your back.



Re: On the Removal of src:tensorflow [and 1 more messages]

2019-09-05 Thread Ian Jackson
Jonas Smedegaard writes ("Re: On the Removal of src:tensorflow"):
> Please beware that there's a huge difference between being able to 
> stumble upon your notes in a web search - and in the case of a wiki page 
> to add additional notes to it - and then to have an open invitation to 
> ask questions to an expert.
> 
> I encourage you to help get your valuable information onto our wiki, 
> regardless of the lack of feedback you have received on it.

Seconded.  However, I don't think this...

> Our wiki does not use mediawiki but an older(?) markup called Creole.

Mo Zhou writes ("Re: On the Removal of src:tensorflow"):
> I dislike the mediawiki markup. [...]

... is likely to be an answer to that.

Mo: if you write up your notes in some plain text format I will put
them on the wiki for you.  You should probably do that by replying to
this thread because of the principle of doing things in public.
Please be sure to CC me on the email.  NB that I do *not* intend to take
on the role of ongoing document editor and won't be incorporating
comments made in response to your mail.

Thanks,
Ian.



Re: On the Removal of src:tensorflow

2019-09-05 Thread Jonas Smedegaard
Quoting Mo Zhou (2019-09-05 07:20:44)
> > If anyone wants to resurrect TensorFlow in Debian, then it is a good 
> > reference to see what they would like to avoid.
> 
> Anyone interested in this could ask me for hints if in need.

Please beware that there's a huge difference between being able to 
stumble upon your notes in a web search - and in the case of a wiki page 
to add additional notes to it - and then to have an open invitation to 
ask questions to an expert.

I encourage you to help get your valuable information onto our wiki, 
regardless of the lack of feedback you have received on it.


Kind regards, from someone appreciating your work but not having many 
questions to ask,

 - Jonas

P.S.

Our wiki does not use mediawiki but an older(?) markup called Creole.

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/

 [x] quote me freely  [ ] ask before reusing  [ ] keep private



Re: On the Removal of src:tensorflow

2019-09-04 Thread Mo Zhou
Hi Yao,

On 2019-09-05 03:55, Yao Wei wrote:
> Recently there's ITP for DeepSpeech (#921519) based on Baidu's research
> and Mozilla Common Voice project, which is depending on TensorFlow:
> https://github.com/mozilla/DeepSpeech

ML-Policy is interested in the giant blob (pretrained neural network)
powering deepspeech.

> I think it is a good practice to leave a tombstone in the wiki that what
> you consider may be pitfalls for packaging deep learning programs and
> the problems that are specific to certain ML frameworks.

I'm planning to write an article connecting all these things together:
debian, artificial intelligence, deep learning, deep learning frameworks,
deep learning applications, SIMDebian, Debian user repository toolkit,
software freedom, nvidia/cuda, amd/hip-rocm, ML-Policy, conda/anaconda,
scientific computing, julia, etc. That would cover nearly all my debian
activities. Some of them may look unrelated, but actually share a single
concentrated motivation.

And the "pitfalls" will naturally be a part of this article.

> If anyone wants to resurrect TensorFlow in Debian, then it is a good
> reference to see what they would like to avoid.

Anyone interested in this could ask me for hints if in need.



Re: On the Removal of src:tensorflow

2019-09-04 Thread Yao Wei
On Wed, Sep 04, 2019 at 07:38:11PM -0700, Mo Zhou wrote:
> I'm wondering who will read it. In the past I broadcasted some bits
> I learned to the public mailing lists and people responded, but
> nobody had ever asked me any detail about anything related.

Recently there's ITP for DeepSpeech (#921519) based on Baidu's research
and Mozilla Common Voice project, which is depending on TensorFlow:

https://github.com/mozilla/DeepSpeech

I think it is a good practice to leave a tombstone in the wiki that what
you consider may be pitfalls for packaging deep learning programs and
the problems that are specific to certain ML frameworks.

If anyone wants to resurrect TensorFlow in Debian, then it is a good
reference to see what they would like to avoid.

Best regards,
Yao Wei



Re: On the Removal of src:tensorflow

2019-09-04 Thread Mo Zhou
Hi Steffen,

I'm wondering who will read it. In the past I broadcasted some bits
I learned to the public mailing lists and people responded, but
nobody had ever asked me any detail about anything related.

I dislike the mediawiki markup. If you think it's important to
document these insights, could you please help me start a template
and submit it to ML-Policy[1] as a PR/issue? We could create a
page on wiki.d.o and point it to salsa (or paste pandoc conversion
result there).

[1] https://salsa.debian.org/lumin/ml-policy/

On 2019-09-04 07:47, Steffen Möller wrote:
> Hi Mo,
> 
> Would you mind creating a wiki.d.o page that collects your insights? Or
> shall I start and you fill in the gaps?
> 
> Best,
> Steffen
> 
> On 26.08.19 04:04, Mo Zhou wrote:
>> Hi -devel,
>>
>>
>> I've just filed an RM(#935769) bug against src:tensorflow and I believe
>> this is the most appropriate choice at this stage. For packages that
>> would easily draw attention from the media, not providing them would be
>> much better than providing something much inferior than the users
>> expected (Recall "difficulty ... DL framework" and "conda ...").
>>
>>
>> A number of packages in our archive have referenced tensorflow:
>>
>>    https://codesearch.debian.net/search?q=tensorflow=1
>>
>> Or even called its C API (ffmpeg calls tensorflow for its super
>> resolution filter. ffmpeg package maintainers have disabled the
>> --enable-libtensorflow configure option). At this point, for good wish
>> some contributors may hope to put a little bit more effort to save the
>> package and at least keep its C/C++ interface available. To me avoiding
>> the Bazel build (the only build system for tensorflow) is costly, and
>> the yield isn't worth the cost. The most practical recommendation to
>> tensorflow users is "pip or conda".
>>
>>
>> Deep learning (DL) framework is NOT something too complex to be
>> implemented from scratch by a single person. A fundamental DL framework
>> can be implemented with the following functionalities:
>>   1) data loading, e.g. a CSV reader
>>   2) linear operations, e.g. matrix multiplication, convolution
>>   3) element-wise non-linear functions, e.g. max(x,0), exp(x), ln(x)
>>   4) the computation graph (sort of directed acyclic graph)
>>   5) automatic (or manual) differentiation (computing the gradient)
>>   6) first-order gradient-based optimization (network training)
>>
>> That means tensorflow's complexity doesn't come from the theoretical
>> side, but from engineering, especially performance optimization. On the other
>> hand, it's easy for the users to find an alternative to tensorflow if
>> they don't heavily rely on some portion of its functionality.
>>
>>
>> Based on the following facts, I believe removing src:tensorflow is the
>> most appropriate choice at the current stage, and I DISCOURAGE any
>> effort trying to save or re-introduce it.
>>
>>   1) TensorFlow's only well-supported build system, i.e. Bazel is
>>      hopeless to enter Debian.
>>   2) Maintaining an alternative build system (cmake, or any self-made
>>      one) could be costly.
>>   3) Even if somebody conquered the build system issue at a cost,
>>      it's only possible to upload the low-performance version to our main
>>      archive (out of SIMD acceleration due to our ISA baseline,
>>      out of CUDA acceleration or OpenCL acceleration).
>>   4) To mitigate the performance issue one could upload a CUDA version
>>      to contrib section, but I promise that dealing with nvidia stuff
>>      once anything went wrong is a painful experience to free distro
>>      developer.
>>   5) To mitigate the performance issue with OpenCL one could also try
>>      AMD's fully open-source ROCm/HIP software stack (AMD's opensource
>>      counterpart to the nvidia CUDA). However the usage of AMD graphics
>>      for machine learning is still not common, and none of the
>>      related software has been packaged yet.
>>
>> With that said, I still encourage people who care about such topic to
>> maintain building block packages (I'm maintaining some of these) for
>> DL frameworks, or some alternative DL frameworks if you see appropriate.
>>



Re: On the Removal of src:tensorflow

2019-09-04 Thread Steffen Möller

Hi Mo,

Would you mind creating a wiki.d.o page that collects your insights? Or
shall I start and you fill in the gaps?

Best,
Steffen

On 26.08.19 04:04, Mo Zhou wrote:

Hi -devel,


I've just filed an RM(#935769) bug against src:tensorflow and I believe
this is the most appropriate choice at this stage. For packages that
would easily draw attention from the media, not providing them would be
much better than providing something much inferior than the users
expected (Recall "difficulty ... DL framework" and "conda ...").


A number of packages in our archive have referenced tensorflow:

   https://codesearch.debian.net/search?q=tensorflow=1

Or even called its C API (ffmpeg calls tensorflow for its super
resolution filter. ffmpeg package maintainers have disabled the
--enable-libtensorflow configure option). At this point, for good wish
some contributors may hope to put a little bit more effort to save the
package and at least keep its C/C++ interface available. To me avoiding
the Bazel build (the only build system for tensorflow) is costly, and
the yield isn't worth the cost. The most practical recommendation to
tensorflow users is "pip or conda".


Deep learning (DL) framework is NOT something too complex to be
implemented from scratch by a single person. A fundamental DL framework
can be implemented with the following functionalities:
  1) data loading, e.g. a CSV reader
  2) linear operations, e.g. matrix multiplication, convolution
  3) element-wise non-linear functions, e.g. max(x,0), exp(x), ln(x)
  4) the computation graph (sort of directed acyclic graph)
  5) automatic (or manual) differentiation (computing the gradient)
  6) first-order gradient-based optimization (network training)

That means tensorflow's complexity doesn't come from the theoretical
side, but from engineering, especially performance optimization. On the other
hand, it's easy for the users to find an alternative to tensorflow if
they don't heavily rely on some portion of its functionality.


Based on the following facts, I believe removing src:tensorflow is the
most appropriate choice at the current stage, and I DISCOURAGE any
effort trying to save or re-introduce it.

  1) TensorFlow's only well-supported build system, i.e. Bazel is
     hopeless to enter Debian.
  2) Maintaining an alternative build system (cmake, or any self-made
     one) could be costly.
  3) Even if somebody conquered the build system issue at a cost,
     it's only possible to upload the low-performance version to our main
     archive (out of SIMD acceleration due to our ISA baseline,
     out of CUDA acceleration or OpenCL acceleration).
  4) To mitigate the performance issue one could upload a CUDA version
     to contrib section, but I promise that dealing with nvidia stuff
     once anything went wrong is a painful experience to free distro
     developer.
  5) To mitigate the performance issue with OpenCL one could also try
     AMD's fully open-source ROCm/HIP software stack (AMD's opensource
     counterpart to the nvidia CUDA). However the usage of AMD graphics
     for machine learning is still not common, and none of the
     related software has been packaged yet.

With that said, I still encourage people who care about such topic to
maintain building block packages (I'm maintaining some of these) for
DL frameworks, or some alternative DL frameworks if you see appropriate.





Re: On the Removal of src:tensorflow

2019-08-26 Thread Sam Hartman
> "Mo" == Mo Zhou writes:

Mo> Hi -devel, I've just filed an RM(#935769) bug against
Mo> src:tensorflow and I believe this is the most appropriate choice
Mo> at this stage. For packages that would easily draw attention
Mo> from the media, not providing them would be much better than
Mo> providing something much inferior than the users expected
Mo> (Recall "difficulty ... DL framework" and "conda ...").


I'm speaking as an individual here, not as the DPL.

I actually think it's valuable to provide Debian packages even if the
performance is not what users would want.
Provided that people are working on the packages and improving them.
Doing so makes it easier to free things up in the future, makes it
easier to understand what we don't have, etc.

I hear that you no longer find it valuable to do this work.  And if
there aren't maintainers who are interested in working to improve the
situation, I definitely think it is best to remove the package.

I think the part of your message I'm disagreeing with is the desire to
discourage people from reintroducing the package in the future.

I think you've done a good job of documenting the obstacles.  I think
anyone who wants to reintroduce the package should consider the
obstacles you've documented.

But whether because they have work-arounds for those obstacles or
because they see it as worth their time without work-arounds, I think
that's OK.

Although, I'll admit that they're probably going to have to do
something about a build system.  We don't have a lot of use for
packages that don't build:-)
I think what I'm trying to say is that it's great to step away from work
when you don't see value.  It's great to document problems others would
face in the future.
But the bar for telling others not to do things they find valuable is
probably a lot higher.

As always thanks for all your work and especially for writing up your
results!

--Sam



On the Removal of src:tensorflow

2019-08-25 Thread Mo Zhou
Hi -devel,


I've just filed an RM(#935769) bug against src:tensorflow and I believe
this is the most appropriate choice at this stage. For packages that
would easily draw attention from the media, not providing them would be
much better than providing something much inferior than the users
expected (Recall "difficulty ... DL framework" and "conda ...").


A number of packages in our archive have referenced tensorflow:

  https://codesearch.debian.net/search?q=tensorflow=1

Or even called its C API (ffmpeg calls tensorflow for its super
resolution filter. ffmpeg package maintainers have disabled the
--enable-libtensorflow configure option). At this point, for good wish
some contributors may hope to put a little bit more effort to save the
package and at least keep its C/C++ interface available. To me avoiding
the Bazel build (the only build system for tensorflow) is costly, and
the yield isn't worth the cost. The most practical recommendation to
tensorflow users is "pip or conda".


Deep learning (DL) framework is NOT something too complex to be
implemented from scratch by a single person. A fundamental DL framework
can be implemented with the following functionalities:
 1) data loading, e.g. a CSV reader
 2) linear operations, e.g. matrix multiplication, convolution
 3) element-wise non-linear functions, e.g. max(x,0), exp(x), ln(x)
 4) the computation graph (sort of directed acyclic graph)
 5) automatic (or manual) differentiation (computing the gradient)
 6) first-order gradient-based optimization (network training)

That means tensorflow's complexity doesn't come from the theoretical
side, but from engineering, especially performance optimization. On the other
hand, it's easy for the users to find an alternative to tensorflow if
they don't heavily rely on some portion of its functionality.
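
As a rough illustration of the six points above, here is a minimal sketch
in Python using only the standard library. It is not taken from tensorflow
or any real framework: the `Value` class, `load_csv`, `train`, the toy CSV
data and the hyperparameters are all made up for this example, and scalar
arithmetic stands in for real matrix operations.

```python
import csv
import io


class Value:
    """A scalar node in the computation graph (point 4)."""

    def __init__(self, data, parents=()):
        self.data = float(data)
        self.grad = 0.0
        self.parents = parents
        self.backward_fn = None  # pushes self.grad back to the parents

    def __add__(self, other):  # linear operation (point 2)
        out = Value(self.data + other.data, (self, other))

        def backward_fn():
            self.grad += out.grad
            other.grad += out.grad
        out.backward_fn = backward_fn
        return out

    def __mul__(self, other):  # linear operation (point 2)
        out = Value(self.data * other.data, (self, other))

        def backward_fn():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out.backward_fn = backward_fn
        return out

    def relu(self):  # element-wise non-linearity max(x, 0) (point 3)
        out = Value(max(self.data, 0.0), (self,))

        def backward_fn():
            self.grad += (1.0 if self.data > 0 else 0.0) * out.grad
        out.backward_fn = backward_fn
        return out

    def backward(self):
        """Reverse-mode automatic differentiation (point 5)."""
        order, seen = [], set()

        def visit(node):  # depth-first topological sort of the DAG
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node.parents:
                    visit(parent)
                order.append(node)
        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            if node.backward_fn is not None:
                node.backward_fn()


# Hypothetical toy dataset generated from y = 2x + 1.
CSV_TEXT = "x,y\n0,1\n1,3\n2,5\n3,7\n"


def load_csv(text):
    """Data loading with a CSV reader (point 1)."""
    rows = csv.reader(io.StringIO(text))
    next(rows)  # skip the header line
    return [(float(x), float(y)) for x, y in rows]


def train(samples, steps=500, lr=0.1):
    """Plain gradient descent on the mean squared error (point 6)."""
    w, b = Value(0.5), Value(0.0)
    for _ in range(steps):
        loss = Value(0.0)
        for x, y in samples:
            pred = (w * Value(x) + b).relu()
            diff = pred + Value(-y)
            loss = loss + diff * diff
        loss = loss * Value(1.0 / len(samples))
        w.grad = b.grad = 0.0  # intermediate nodes are rebuilt every step
        loss.backward()
        w.data -= lr * w.grad
        b.data -= lr * b.grad
    return w.data, b.data


w, b = train(load_csv(CSV_TEXT))
print(f"w={w:.3f} b={b:.3f}")
```

Everything a production framework adds on top of this (tensor kernels,
SIMD and GPU back-ends, distributed execution) belongs to the engineering
side rather than the theoretical one.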


Based on the following facts, I believe removing src:tensorflow is the
most appropriate choice at the current stage, and I DISCOURAGE any
effort trying to save or re-introduce it.

 1) TensorFlow's only well-supported build system, i.e. Bazel is
    hopeless to enter Debian.
 2) Maintaining an alternative build system (cmake, or any self-made
    one) could be costly.
 3) Even if somebody conquered the build system issue at a cost,
    it's only possible to upload the low-performance version to our main
    archive (out of SIMD acceleration due to our ISA baseline,
    out of CUDA acceleration or OpenCL acceleration).
 4) To mitigate the performance issue one could upload a CUDA version
    to contrib section, but I promise that dealing with nvidia stuff
    once anything went wrong is a painful experience to free distro
    developer.
 5) To mitigate the performance issue with OpenCL one could also try
    AMD's fully open-source ROCm/HIP software stack (AMD's opensource
    counterpart to the nvidia CUDA). However the usage of AMD graphics
    for machine learning is still not common, and none of the
    related software has been packaged yet.
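
To make the ISA baseline part of point 3 concrete, the following Python
sketch lists the SIMD extensions a CPU advertises that a baseline build
could not rely on. It assumes the amd64 baseline is generic x86-64 (which
guarantees SSE and SSE2 only) and that the flag names follow the Linux
/proc/cpuinfo convention, where SSE3 shows up as "pni". The function name
`unused_simd` and the sample text are invented for illustration.

```python
# Assumption: the amd64 baseline is generic x86-64, i.e. SSE and SSE2 only.
BASELINE = {"sse", "sse2"}

# SIMD-related flag names as spelled in Linux /proc/cpuinfo ("pni" is SSE3).
SIMD_FLAGS = ["sse", "sse2", "pni", "ssse3", "sse4_1", "sse4_2",
              "avx", "avx2", "fma", "avx512f"]


def unused_simd(cpuinfo_text):
    """Return SIMD flags the CPU has but a baseline build cannot assume."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags = set(line.split(":", 1)[1].split())
            return [f for f in SIMD_FLAGS if f in flags and f not in BASELINE]
    return []


# Hypothetical excerpt; on a real Linux machine read /proc/cpuinfo instead.
SAMPLE = ("processor\t: 0\n"
          "flags\t\t: fpu sse sse2 pni ssse3 sse4_1 sse4_2 avx avx2 fma\n")

print(unused_simd(SAMPLE))
```

Every extension in the resulting list is one that a binary compiled for the
baseline leaves idle unless the software does its own runtime dispatch.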

With that said, I still encourage people who care about such topic to
maintain building block packages (I'm maintaining some of these) for
DL frameworks, or some alternative DL frameworks if you see appropriate.