Re: mir-stat

2020-10-30 Thread jmh530 via Digitalmars-d-announce

On Friday, 30 October 2020 at 10:12:58 UTC, Kagamin wrote:

On Tuesday, 13 October 2020 at 10:30:41 UTC, jmh530 wrote:
The difference is that MIT lets you deal in the Software without 
restriction and lists a few rights only as examples, while Boost 
grants a fixed list of rights. I only meant that the MIT license 
is more permissive in that if there is something else you want to 
do with the code that is not listed in Boost (I don't know what 
that would be), then MIT would allow it.


Just make sure you don't grant exclusive rights :)


Ilya ended up going with the Apache license.

https://github.com/libmir/mir-algorithm/blob/master/LICENSE


Re: mir-stat

2020-10-30 Thread Kagamin via Digitalmars-d-announce

On Tuesday, 13 October 2020 at 10:30:41 UTC, jmh530 wrote:
The difference is that MIT lets you deal in the Software without 
restriction and lists a few rights only as examples, while Boost 
grants a fixed list of rights. I only meant that the MIT license 
is more permissive in that if there is something else you want to 
do with the code that is not listed in Boost (I don't know what 
that would be), then MIT would allow it.


Just make sure you don't grant exclusive rights :)


Re: mir-stat

2020-10-13 Thread jmh530 via Digitalmars-d-announce

On Tuesday, 13 October 2020 at 07:02:26 UTC, aberba wrote:

On Monday, 12 October 2020 at 00:43:51 UTC, jmh530 wrote:

On Sunday, 11 October 2020 at 17:35:26 UTC, 9il wrote:

[snip]


I can't speak to the technical differences between the two. My 
understanding is that MIT is more permissive than Boost, 


I make all my stuff Boost so that anyone can do whatever they 
want with the code. So I hope it's not that permissive.


Boost says:

Permission is hereby granted, free of charge, to any person or 
organization obtaining a copy of the software and accompanying 
documentation covered by this license (the "Software") to use, 
reproduce, display, distribute, execute, and transmit the 
Software, and to prepare derivative works of the Software, and to 
permit third-parties to whom the Software is furnished to do so, 
all subject to the following:

MIT says:

Permission is hereby granted, free of charge, to any person 
obtaining a copy of this software and associated documentation 
files (the "Software"), to deal in the Software without 
restriction, including without limitation the rights to use, 
copy, modify, merge, publish, distribute, sublicense, and/or sell 
copies of the Software, and to permit persons to whom the 
Software is furnished to do so, subject to the following 
conditions:


The difference is that MIT lets you deal in the Software without 
restriction and lists a few rights only as examples, while Boost 
grants a fixed list of rights. I only meant that the MIT license 
is more permissive in that if there is something else you want to 
do with the code that is not listed in Boost (I don't know what 
that would be), then MIT would allow it.


Re: mir-stat

2020-10-13 Thread aberba via Digitalmars-d-announce

On Monday, 12 October 2020 at 00:43:51 UTC, jmh530 wrote:

On Sunday, 11 October 2020 at 17:35:26 UTC, 9il wrote:

[snip]


I can't speak to the technical differences between the two. My 
understanding is that MIT is more permissive than Boost, 


I make all my stuff Boost so that anyone can do whatever they 
want with the code. So I hope it's not that permissive.


Re: mir-stat

2020-10-11 Thread jmh530 via Digitalmars-d-announce

On Sunday, 11 October 2020 at 17:35:26 UTC, 9il wrote:

[snip]

Maybe we should replace Boost with MIT for most of the Mir 
packages. What do you think?


I can't speak to the technical differences between the two. My 
understanding is that MIT is more permissive than Boost, but MIT 
always requires the user to include a copy of the copyright 
notice, while Boost has an exception (for copies distributed only 
as machine-executable object code).


Anyway, it looks like the dstats/distrib.d file [1] is based on 
MathExtra [2], which is based on Cephes [3]. The dstats file looks 
like it has a 3-clause BSD license, while MathExtra is MIT 
licensed. Cephes seems to be copyrighted.





[1] https://github.com/DlangScience/dstats/blob/master/source/dstats/distrib.d

[2] http://www.dsource.org/projects/mathextra
[3] https://www.netlib.org/cephes/


Re: mir-stat

2020-10-11 Thread 9il via Digitalmars-d-announce

On Sunday, 11 October 2020 at 17:10:19 UTC, jmh530 wrote:

On Sunday, 11 October 2020 at 10:14:04 UTC, tastyminerals wrote:

On Thursday, 8 October 2020 at 16:40:01 UTC, 9il wrote:
It is a pleasure to announce the Dlang Statistical Package by 
John Michael Hall.


[...]


Awesome! Are there any plans to add functions for inferential 
stats?


Next thing I want to add is histogram (influenced by Boost 
histogram), but I have been a bit busy lately and haven't 
finished it. After histogram, the next step would probably be 
pdfs/cdfs/icdfs, but I was thinking about just borrowing from 
what is in dstats (I'll need to look into the license 
compatibility). With those functions in there, then t-test and 
similar functions would be straightforward.


Maybe we should replace Boost with MIT for most of the Mir 
packages. What do you think?


Re: mir-stat

2020-10-11 Thread jmh530 via Digitalmars-d-announce

On Sunday, 11 October 2020 at 10:14:04 UTC, tastyminerals wrote:

On Thursday, 8 October 2020 at 16:40:01 UTC, 9il wrote:
It is a pleasure to announce the Dlang Statistical Package by 
John Michael Hall.


[...]


Awesome! Are there any plans to add functions for inferential 
stats?


Next thing I want to add is histogram (influenced by Boost 
histogram), but I have been a bit busy lately and haven't 
finished it. After histogram, the next step would probably be 
pdfs/cdfs/icdfs, but I was thinking about just borrowing from 
what is in dstats (I'll need to look into the license 
compatibility). With those functions in there, then t-test and 
similar functions would be straightforward.
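
To illustrate why the distribution functions come first: the statistic behind a two-sample t-test needs only means and variances, but turning it into a p-value needs the t distribution's cdf. Below is a minimal sketch of Welch's t statistic, written in Python with only the standard library purely for illustration. The function name `welch_t` is hypothetical and is not part of mir-stat or dstats.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (illustrative only)."""
    # statistics.variance is the sample variance (divides by n - 1).
    var_a, var_b = variance(a), variance(b)
    # Standard error of the difference in means, without assuming equal variances.
    std_err = sqrt(var_a / len(a) + var_b / len(b))
    return (mean(a) - mean(b)) / std_err


if __name__ == "__main__":
    t = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
    # A p-value would additionally require the cdf of the t distribution,
    # which this sketch deliberately leaves out.
    print(t)
```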


Re: mir-stat

2020-10-11 Thread tastyminerals via Digitalmars-d-announce

On Thursday, 8 October 2020 at 16:40:01 UTC, 9il wrote:
It is a pleasure to announce the Dlang Statistical Package by 
John Michael Hall.


[...]


Awesome! Are there any plans to add functions for inferential 
stats?


Re: mir-stat

2020-10-08 Thread James Blachly via Digitalmars-d-announce

On 10/8/20 12:40 PM, 9il wrote:
It is a pleasure to announce the Dlang Statistical Package by John 
Michael Hall.


API
http://mir-stat.libmir.org/

GitHub
http://github.com/libmir/mir-stat

DUB
https://code.dlang.org/packages/mir-stat

The initial release provides descriptive statistics and algorithms for 
transforming data that are useful in statistical applications.


The very basic stuff like `gmean` [1] is located in the mir-algorithm 
package, it will be downloaded automatically.


The generation of random numbers of various distributions is provided by 
mir-random package [2].


Outstanding work by all involved. Thank you for driving this (and all of 
mir) forward. We have already found a use for it in our computational 
biology lab.




Re: mir-stat

2020-10-08 Thread Andre Pany via Digitalmars-d-announce

On Thursday, 8 October 2020 at 18:17:30 UTC, jmh530 wrote:

On Thursday, 8 October 2020 at 17:53:53 UTC, Andre Pany wrote:

[snip]

Thanks for this great piece of software. Does Mir provide 
something similar to a Pandas DataFrame, especially the ability 
to name columns and mark them as index columns?


Kind regards
Andre


Magpie [1] was an initial effort as a summer of code project. 
The last commit was September 2019.


There is also some basic support in mir (example at [2]). Ilya 
can speak more about long-term plans for enhancing that.


One limitation in mir is that a Slice only allows a single element 
type throughout. For instance, a Slice!(double*, 1u) is a 
1-dimensional slice of doubles. Data frames in R or Pandas 
DataFrames allow for columns with different types, so for 
instance you can calculate some summary statistic based on some 
category (like color). So to really get the same functionality, 
you need to support slices with heterogeneous types.


[1] https://github.com/Kriyszig/magpie
[2] https://github.com/libmir/mir-algorithm/blob/f30ccd9f7abc63166c9179e04b2817bf656764bd/source/mir/ndslice/allocation.d#L330


Thanks for this info. Magpie looks huge and really useful. I 
will give it a try.


I am also very interested in the long-term plans for Mir, given 
the current limitations you explained. In my scenario, though, it 
is always the same type: a 2D array of doubles, read from Parquet 
files, transformed, and written to a new Parquet file.


Kind regards
Andre



Re: mir-stat

2020-10-08 Thread jmh530 via Digitalmars-d-announce

On Thursday, 8 October 2020 at 17:53:53 UTC, Andre Pany wrote:

[snip]

Thanks for this great piece of software. Does Mir provide 
something similar to a Pandas DataFrame, especially the ability 
to name columns and mark them as index columns?


Kind regards
Andre


Magpie [1] was an initial effort as a summer of code project. The 
last commit was September 2019.


There is also some basic support in mir (example at [2]). Ilya 
can speak more about long-term plans for enhancing that.


One limitation in mir is that a Slice only allows a single element 
type throughout. For instance, a Slice!(double*, 1u) is a 
1-dimensional slice of doubles. Data frames in R or Pandas 
DataFrames allow for columns with different types, so for 
instance you can calculate some summary statistic based on some 
category (like color). So to really get the same functionality, 
you need to support slices with heterogeneous types.
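
As a rough illustration of the data-frame use case described above, here is a minimal sketch of a summary statistic computed per category over records whose columns have different types. It is written in Python with only the standard library, purely for illustration. The sample data and the helper name `mean_by` are hypothetical, and this is not how mir, Magpie, or Pandas implement it.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: the "color" column holds strings and the
# "weight" column holds floats, which a single homogeneous array
# of doubles cannot represent.
rows = [
    {"color": "red", "weight": 1.0},
    {"color": "blue", "weight": 2.5},
    {"color": "red", "weight": 3.0},
    {"color": "blue", "weight": 3.5},
]


def mean_by(records, key, value):
    """Group records by `key` and return the mean of `value` per group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record[value])
    return {name: mean(values) for name, values in groups.items()}


if __name__ == "__main__":
    print(mean_by(rows, "color", "weight"))
```

When every column is a double, as in a plain 2D slice, the category column has to be encoded numerically or kept in a separate array alongside the data.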


[1] https://github.com/Kriyszig/magpie
[2] https://github.com/libmir/mir-algorithm/blob/f30ccd9f7abc63166c9179e04b2817bf656764bd/source/mir/ndslice/allocation.d#L330


Re: mir-stat

2020-10-08 Thread Andre Pany via Digitalmars-d-announce

On Thursday, 8 October 2020 at 16:40:01 UTC, 9il wrote:
It is a pleasure to announce the Dlang Statistical Package by 
John Michael Hall.


[...]


Thanks for this great piece of software. Does Mir provide something 
similar to a Pandas DataFrame, especially the ability to name 
columns and mark them as index columns?


Kind regards
Andre