Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-09 Thread Páll Haraldsson
On Friday, May 1, 2015 at 5:23:40 PM UTC, Steven G. Johnson wrote:
>
> On Friday, May 1, 2015 at 1:12:00 PM UTC-4, Steven Sagaert wrote: 
>>
>> That wasn't what I was saying. I like the philosophy behind julia. But in 
>> practice (as of now) even in julia you still have to code in a certain 
>> style if you want very good performance and that's no different than in any 
>> other language.
>>
>
> The goal of Julia is not to be a language in which it is *impossible* to 
> write slow code, or a language in which all programming styles are equally 
> fast.   The goal (or at least, one of the goals) is to be an expressive, 
> high-level dynamic language, in which it is also *possible* to write 
> performance-critical inner-loop code.
>

*Summary*

Thanks (all) for answering. I agree that making it *possible* to write fast 
code is a goal, and I believe that has been achieved. Nobody commented much 
on my list of concerns, though..

Yes, of course making it *impossible* to write slow code is a very high bar.. 
I just thought Python - an interpreted language - wasn't a high bar :) I'm 
only using it as a comparison. I would like (newbie) Julia code not to be 
beaten by (core-language) Python, or at least not by much (a constant 
factor). Has that been achieved? I noticed the "yes/no" answer on "Any". Are 
globals no longer a problem? Yes, they get you slow code, but how does that 
compare to Python? Are tuples/Dicts now as fast? [I just noticed the named 
tuples thread.]

Then there are, of course, Python libraries that are "faster" than immature 
(or nonexistent) Julia ones.. My hope is that through PyCall you can use them 
all (I understand that to be the case) - without a speed penalty. We may 
still have the two/N-language problem for a while, for functionality reasons 
but not for speed reasons.. The dual Julia/Python "problem" is, I think, much 
preferable to Julia/C or Python/C.. and it gets you all the "batteries 
included" you would want (speaking as a "non-math" user).
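
For reference, basic PyCall usage is just a couple of lines - a minimal 
sketch, assuming PyCall (and, on the Python side, NumPy) is installed; the 
@pyimport form shown was the PyCall usage of that era:

using PyCall

@pyimport math            # wraps a Python stdlib module as a Julia binding
@pyimport numpy as np     # third-party packages work the same way

math.sin(math.pi / 4)     # calls into CPython, returns a Julia Float64
np.sum([1, 2, 3])         # Julia arrays are converted automatically

Each call does cross the Julia/Python boundary, so there is some per-call 
overhead - hot inner loops should stay on one side or the other.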

Great to see that strings are being worked on; I never wanted this thread 
to be just about "one thing". I can now see how reference counting in Python 
helps strings.. I'm also looking into how to beat Python there..


 
>
> That *is* different from other high-level languages, in which it is 
> typically *not* possible to write performance-critical inner-loop code 
> without dropping down to a lower-level language (C, Fortran, Cython...).   
> If you are coding exclusively in Python or R, and there isn't an optimized 
> function appropriate for the innermost loops of your task at hand, you are 
> out of luck.
>


Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-04 Thread Tamas Papp
On Mon, May 04 2015, Scott Jones  wrote:

>> On May 4, 2015, at 3:21 AM, Tamas Papp  wrote:
>> 
>> I think you misunderstand: IOBuffer is suggested not for mutable string
>> operations in general, but only for efficient concatenation of many
>> strings.
>> 
>> Best,
>> 
>> Tamas
>
> I don’t think that I misunderstood - it’s that using IOBuffer is the only 
> solution that has been given here… and it doesn’t handle what I need to do 
> efficiently...
> If you have a better solution, please let me know…

1. Can you share the benchmarks (and simplified, self-contained code)
for your problem using IOBuffer? I have always found it very fast, but
maybe what you are working on is different.

2. Do you have a specific algorithm in mind that would be more
efficient?

Best,

Tamas
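
P.S. For concreteness, a self-contained comparison of the two approaches 
might look something like this (build_concat/build_iobuffer are just 
illustrative names; takebuf_string is the 0.3/0.4-era call, later spelled 
String(take!(io))):

# build an n-character string by repeated concatenation vs. via an IOBuffer
function build_concat(n)
    s = ""
    for i in 1:n
        s = s * "x"        # allocates a brand-new string every iteration
    end
    s
end

function build_iobuffer(n)
    io = IOBuffer()
    for i in 1:n
        write(io, "x")     # appends into a growable byte buffer
    end
    takebuf_string(io)     # empties the buffer and returns the string
end

build_concat(1000); build_iobuffer(1000)   # warm up the JIT first
@time build_concat(100000)
@time build_iobuffer(100000)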


Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-04 Thread Scott Jones

> On May 4, 2015, at 3:21 AM, Tamas Papp  wrote:
> 
> I think you misunderstand: IOBuffer is suggested not for mutable string
> operations in general, but only for efficient concatenation of many
> strings.
> 
> Best,
> 
> Tamas

I don’t think that I misunderstood - it’s that using IOBuffer is the only 
solution that has been given here… and it doesn’t handle what I need to do 
efficiently...
If you have a better solution, please let me know…

Scott

> On Mon, May 04 2015, Scott Jones wrote:
> 
>> I wasn't trying to say that it was specific to strings, I was saying that
>> it is not specific to I/O, which the name would seem to indicate...
>> and it keeps getting brought up as something that should be used for basic
>> mutable string operations.
>> 
>> On Sunday, May 3, 2015 at 3:20:43 PM UTC-4, Tamas Papp wrote:
>>> 
>>> consider
>>> 
>>> let io = IOBuffer()
>>>  write(io,rand(10))
>>>  takebuf_array(io)
>>> end
>>> 
>>> IOBuffer() is not specific to strings at all.
>>> 
>>> Best,
>>> 
>>> Tamas
>>> 
>>> On Sun, May 03 2015, Scott Jones wrote:
>>> 
 Because you can have binary strings and text strings... there is even a
 special literal for binary strings...
 b"\xffThis is a binary\x01\string"
 "This is a \u307 text string"
 
 Calling it an IOBuffer makes it sound like it is specific to I/O, not
>>> just
 strings (binary or text) that you might never do I/O on...
 
 On Sunday, May 3, 2015 at 2:43:14 PM UTC-4, Kristoffer Carlsson wrote:
> 
> Why should it be called StringBuffer when another common use of it is
>>> to
> write raw binary data?



Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-04 Thread Tamas Papp
I think you misunderstand: IOBuffer is suggested not for mutable string
operations in general, but only for efficient concatenation of many
strings.

Best,

Tamas

On Mon, May 04 2015, Scott Jones  wrote:

> I wasn't trying to say that it was specific to strings, I was saying that
> it is not specific to I/O, which the name would seem to indicate...
> and it keeps getting brought up as something that should be used for basic
> mutable string operations.
>
> On Sunday, May 3, 2015 at 3:20:43 PM UTC-4, Tamas Papp wrote:
>>
>> consider
>>
>> let io = IOBuffer()
>>   write(io,rand(10))
>>   takebuf_array(io)
>> end
>>
>> IOBuffer() is not specific to strings at all.
>>
>> Best,
>>
>> Tamas
>>
>> On Sun, May 03 2015, Scott Jones wrote:
>>
>> > Because you can have binary strings and text strings... there is even a
>> > special literal for binary strings...
>> > b"\xffThis is a binary\x01\string"
>> > "This is a \u307 text string"
>> >
>> > Calling it an IOBuffer makes it sound like it is specific to I/O, not
>> just
>> > strings (binary or text) that you might never do I/O on...
>> >
>> > On Sunday, May 3, 2015 at 2:43:14 PM UTC-4, Kristoffer Carlsson wrote:
>> >>
>> >> Why should it be called StringBuffer when another common use of it is
>> to
>> >> write raw binary data?
>>


Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-04 Thread Scott Jones


On Sunday, May 3, 2015 at 6:10:00 PM UTC-4, Kevin Squire wrote:
>
> One thing I was confused about when I first started using Julia was that 
> things that are done with strings in other languages are often done 
> directly with IO objects in Julia.
>
> For example, consider that, in Python, most classes define `__str__()` and 
> `__repr__()`, which create string representations of objects of this class 
> (the first more meant for human consumption, the second for parsing 
> (usually)).  
>
> In Julia, the implicit assumption is that most strings are meant for 
> output in some way, so why not skip the extra memory allocation and write 
> the string representation directly to output.  For this, types define 
> `show(io::IO, x::MyType)`.  If you really want to manipulate such strings, 
> you can (as pointed out in this thread) go through an IOBuffer object 
> first.  (There is also `repr(x::SomeType)`, but it's not emphasized as 
> much.)
>

The problem is that, with what I'm doing, the strings are almost never 
written to output... they are analyzed, modified, and stored in / retrieved 
from a database... and you want all the normal string operations... you 
might be doing regex search/replace, for example... and for performance 
reasons, you don't want to be converting to an immutable string all the time.
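
(To illustrate the point: regex search/replace on a normal string always 
hands back a new immutable string - a tiny sketch, using the three-argument 
replace of that era; newer Julia writes the pattern and replacement as a pair:

s = "foo bar foo"
t = replace(s, r"foo", "baz")   # returns a new string
# t == "baz bar baz"; s itself is unchanged, since strings are immutable

so every edit pays for a fresh allocation.)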

> This was a design decision made early on.  I personally found (and still 
> find) it somewhat awkward at times, but for many things, it works fine, and 
> (seemingly) it lets most string output allocate less memory by default.
>
> Now, it certainly is the case that mutable strings may be very useful in 
> some contexts.  The BioSeq.jl package implements mutable DNA and protein 
> sequences, which are very useful there, and would be represented by mutable 
> strings in many other languages.  The best way to test that would probably 
> be to create a package (say, MutableStrings.jl), and define useful types 
> and functions there.
>

There are a few things I'd like to add to Julia wrt strings: validated 
strings (right now, it is a bit of a mishmash as to whether or not convert 
functions will accept invalid Unicode data) and mutable strings...  Somebody 
already did create a MutableStrings.jl; however, it is broken, it doesn't 
look like it has been updated in over a year, and it only covers ASCII and 
UTF-8 - it doesn't have UTF-16 or UTF-32 mutable strings...
(I also want mutable 8-bit (ANSI Latin-1) strings and UCS-2 strings, i.e. 
UTF-16 with no surrogates; that way they would be DirectIndexStrings, giving 
O(1) instead of O(n) for some operations.)

> Cheers,
>    Kevin
>
 


Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-03 Thread Scott Jones
I wasn't trying to say that it was specific to strings, I was saying that 
it is not specific to I/O, which the name would seem to indicate...
and it keeps getting brought up as something that should be used for basic 
mutable string operations.

On Sunday, May 3, 2015 at 3:20:43 PM UTC-4, Tamas Papp wrote:
>
> consider 
>
> let io = IOBuffer() 
>   write(io,rand(10)) 
>   takebuf_array(io) 
> end 
>
> IOBuffer() is not specific to strings at all. 
>
> Best, 
>
> Tamas 
>
> On Sun, May 03 2015, Scott Jones wrote: 
>
> > Because you can have binary strings and text strings... there is even a 
> > special literal for binary strings... 
> > b"\xffThis is a binary\x01\string" 
> > "This is a \u307 text string" 
> > 
> > Calling it an IOBuffer makes it sound like it is specific to I/O, not 
> just 
> > strings (binary or text) that you might never do I/O on... 
> > 
> > On Sunday, May 3, 2015 at 2:43:14 PM UTC-4, Kristoffer Carlsson wrote: 
> >> 
> >> Why should it be called StringBuffer when another common use of it is 
> to 
> >> write raw binary data? 
>


Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-03 Thread Kevin Squire
One thing I was confused about when I first started using Julia was that
things that are done with strings in other languages are often done
directly with IO objects in Julia.

For example, consider that, in Python, most classes define `__str__()` and
`__repr__()`, which create string representations of objects of this class
(the first more meant for human consumption, the second for parsing
(usually)).

In Julia, the implicit assumption is that most strings are meant for output
in some way, so why not skip the extra memory allocation and write the
string representation directly to output.  For this, types define
`show(io::IO, x::MyType)`.  If you really want to manipulate such strings,
you can (as pointed out in this thread) go through an IOBuffer object
first.  (There is also `repr(x::SomeType)`, but it's not emphasized as
much.)
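
A minimal illustration of that convention, with a made-up Point type ("type" 
was the keyword of that era; later Julia uses "struct"):

type Point
    x::Int
    y::Int
end

Base.show(io::IO, p::Point) = print(io, "Point(", p.x, ",", p.y, ")")

p = Point(1, 2)
show(STDOUT, p)       # writes straight to a stream, no intermediate string
s = sprint(show, p)   # capture the same output as a string when needed
r = repr(p)           # equivalent shortcut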

This was a design decision made early on.  I personally found (and still
find) it somewhat awkward at times, but for many things, it works fine, and
(seemingly) it lets most string output allocate less memory by default.

Now, it certainly is the case that mutable strings may be very useful in
some contexts.  The BioSeq.jl package implements mutable DNA and protein
sequences, which are very useful there, and would be represented by mutable
strings in many other languages.  The best way to test that would probably
be to create a package (say, MutableStrings.jl), and define useful types
and functions there.

Cheers,
   Kevin



On Sun, May 3, 2015 at 12:20 PM, Tamas Papp  wrote:

> consider
>
> let io = IOBuffer()
>   write(io,rand(10))
>   takebuf_array(io)
> end
>
> IOBuffer() is not specific to strings at all.
>
> Best,
>
> Tamas
>
> On Sun, May 03 2015, Scott Jones  wrote:
>
> > Because you can have binary strings and text strings... there is even a
> > special literal for binary strings...
> > b"\xffThis is a binary\x01\string"
> > "This is a \u307 text string"
> >
> > Calling it an IOBuffer makes it sound like it is specific to I/O, not
> just
> > strings (binary or text) that you might never do I/O on...
> >
> > On Sunday, May 3, 2015 at 2:43:14 PM UTC-4, Kristoffer Carlsson wrote:
> >>
> >> Why should it be called StringBuffer when another common use of it is to
> >> write raw binary data?
>


Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-03 Thread Tamas Papp
consider

let io = IOBuffer()
  write(io,rand(10))
  takebuf_array(io)
end

IOBuffer() is not specific to strings at all.

Best,

Tamas

On Sun, May 03 2015, Scott Jones  wrote:

> Because you can have binary strings and text strings... there is even a
> special literal for binary strings...
> b"\xffThis is a binary\x01\string"
> "This is a \u307 text string"
>
> Calling it an IOBuffer makes it sound like it is specific to I/O, not just
> strings (binary or text) that you might never do I/O on...
>
> On Sunday, May 3, 2015 at 2:43:14 PM UTC-4, Kristoffer Carlsson wrote:
>>
>> Why should it be called StringBuffer when another common use of it is to
>> write raw binary data?


Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-03 Thread Scott Jones
Because you can have binary strings and text strings... there is even a 
special literal for binary strings...
b"\xffThis is a binary\x01\string"
"This is a \u307 text string"

Calling it an IOBuffer makes it sound like it is specific to I/O, not just 
strings (binary or text) that you might never do I/O on...
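
Concretely (hedging on version details - the byte type was spelled Uint8 
before 0.4), a b"..." literal gives a byte array rather than a text string:

bytes = b"\xffsome binary\x01data"    # byte-string literal -> a vector of bytes
text  = "a text string with \u0307"   # ordinary string literal

typeof(bytes)   # Array{UInt8,1}  (Uint8 before Julia 0.4)
typeof(text)    # UTF8String in 0.3/0.4, String in later versions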

On Sunday, May 3, 2015 at 2:43:14 PM UTC-4, Kristoffer Carlsson wrote:
>
> Why should it be called StringBuffer when another common use of it is to 
> write raw binary data?



Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-03 Thread Kristoffer Carlsson
Why should it be called StringBuffer when another common use of it is to write 
raw binary data?

Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-03 Thread Scott Jones
I should be clear, I didn't mean that all strings should be immutable, but 
rather that I also want to have mutable strings available... There is a package 
for that, but 1) I think it's incomplete (I may need to contribute to it), and 
2) I think that they do belong in the base language...
CLU had both, which was very nice...
For many things, IOBuffer is exactly the right way of doing things (the name is 
misleading though... Maybe it should have been StringBuffer...), but there are 
use cases where you are constantly modifying the string while performing other 
string operations on it...

Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-03 Thread Steven Sagaert
You really should ask the language designers about this for a definite 
answer, but one of the reasons strings are immutable in julia (and 
in Java & others) is that it makes them good keys for Dicts.

On Saturday, May 2, 2015 at 7:16:24 PM UTC+2, Jameson wrote:
>
> IOBuffer does not inherit from string, nor does it implement any of the 
> methods expected of a mutable string (length, endof, insert! / splice! / 
> append!). If you want strings that support all of those operations, then 
> you will need something different from an IOBuffer. If you just wanted a 
> fast string builder, then IOBuffer is the right abstraction (ending with a 
> call to `takebuf_string`). This dichotomy helps to give a clear 
> distinction in the code between the construction phase and usage phase.
>
> On Sat, May 2, 2015 at 12:49 PM Páll Haraldsson wrote:
>
>> 2015-05-01 16:42 GMT+00:00 Steven G. Johnson:
>>
>>>
>>> In Julia, Ruby, Java, Go, and many other languages, concatenation 
>>> allocates a new string and hence building a string by repeated 
>>> concatenation is O(n^2).   That doesn't mean that those other languages 
>>> "lose" on string processing to Python, it just means that you have to do 
>>> things slightly differently (e.g. write to an IOBuffer in Julia).
>>>
>>> You can't always expect the *same code* (translated as literally as 
>>> possible) to be the optimal approach in different languages, and it is 
>>> inflammatory to compare languages according to this standard.
>>>
>>> A fairer question is whether it is *much harder* to get good performance 
>>> in one language vs. another for a certain task.   There will certainly be 
>>> tasks where Python is still superior in this sense simply because there are 
>>> many cases where Python calls highly tuned C libraries for operations that 
>>> have not been as optimized in Julia.  Julia will tend to shine the further 
>>> you stray from "built-in" operations in your performance-critical code.
>>>
>>
>> What I would like to know is do you need to make your own string type to 
>> make Julia as fast (by a constant factor) to say Python. In another answer 
>> IOBuffer was said to be not good enough.
>>
>>
>> -- 
>> Palli.
>>
>

Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-02 Thread Jameson Nash
IOBuffer does not inherit from string, nor does it implement any of the
methods expected of a mutable string (length, endof, insert! / splice! /
append!). If you want strings that support all of those operations, then
you will need something different from an IOBuffer. If you just wanted a
fast string builder, then IOBuffer is the right abstraction (ending with a
call to `takebuf_string`). This dichotomy helps to give a clear
distinction in the code between the construction phase and usage phase.
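
In miniature, the builder pattern described above (takebuf_string is the 
0.3/0.4 name; later versions write String(take!(io))):

io = IOBuffer()                  # construction phase: append freely
for word in ["build", " a", " string"]
    write(io, word)
end
s = takebuf_string(io)           # usage phase: one immutable string; the buffer is emptied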

On Sat, May 2, 2015 at 12:49 PM Páll Haraldsson 
wrote:

> 2015-05-01 16:42 GMT+00:00 Steven G. Johnson :
>
>>
>> In Julia, Ruby, Java, Go, and many other languages, concatenation
>> allocates a new string and hence building a string by repeated
>> concatenation is O(n^2).   That doesn't mean that those other languages
>> "lose" on string processing to Python, it just means that you have to do
>> things slightly differently (e.g. write to an IOBuffer in Julia).
>>
>> You can't always expect the *same code* (translated as literally as
>> possible) to be the optimal approach in different languages, and it is
>> inflammatory to compare languages according to this standard.
>>
>> A fairer question is whether it is *much harder* to get good performance
>> in one language vs. another for a certain task.   There will certainly be
>> tasks where Python is still superior in this sense simply because there are
>> many cases where Python calls highly tuned C libraries for operations that
>> have not been as optimized in Julia.  Julia will tend to shine the further
>> you stray from "built-in" operations in your performance-critical code.
>>
>
> What I would like to know is do you need to make your own string type to
> make Julia as fast (by a constant factor) to say Python. In another answer
> IOBuffer was said to be not good enough.
>
>
> --
> Palli.
>


Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-02 Thread Páll Haraldsson
2015-05-01 16:42 GMT+00:00 Steven G. Johnson :

>
> In Julia, Ruby, Java, Go, and many other languages, concatenation
> allocates a new string and hence building a string by repeated
> concatenation is O(n^2).   That doesn't mean that those other languages
> "lose" on string processing to Python, it just means that you have to do
> things slightly differently (e.g. write to an IOBuffer in Julia).
>
> You can't always expect the *same code* (translated as literally as
> possible) to be the optimal approach in different languages, and it is
> inflammatory to compare languages according to this standard.
>
> A fairer question is whether it is *much harder* to get good performance
> in one language vs. another for a certain task.   There will certainly be
> tasks where Python is still superior in this sense simply because there are
> many cases where Python calls highly tuned C libraries for operations that
> have not been as optimized in Julia.  Julia will tend to shine the further
> you stray from "built-in" operations in your performance-critical code.
>

What I would like to know is: do you need to make your own string type to
make Julia as fast (within a constant factor) as, say, Python? In another
answer IOBuffer was said to be not good enough.

-- 
Palli.


Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-01 Thread elextr

>
>  If you are coding exclusively in Python or R, and there isn't an 
> optimized function appropriate for the innermost loops of your task at 
> hand, you are out of luck.
>


This is the important take-home message: Julia is intended to allow both 
quick, simple, interactive, dynamic code and optimised, fast code to be 
written in one language.

I think Stefan announced Julia as "we want it all" :) 


Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-01 Thread Scott Jones


On Friday, May 1, 2015 at 1:23:40 PM UTC-4, Steven G. Johnson wrote:
>
> On Friday, May 1, 2015 at 1:12:00 PM UTC-4, Steven Sagaert wrote: 
>>
>> That wasn't what I was saying. I like the philosophy behind julia. But in 
>> practice (as of now) even in julia you still have to code in a certain 
>> style if you want very good performance and that's no different than in any 
>> other language.
>>
>
> The goal of Julia is not to be a language in which it is *impossible* to 
> write slow code, or a language in which all programming styles are equally 
> fast.   The goal (or at least, one of the goals) is to be an expressive, 
> high-level dynamic language, in which it is also *possible* to write 
> performance-critical inner-loop code.
>

Yep, totally agree!  I had to deal with people (smart people too, who went 
to MIT also ;-) ) who expected the compiler/interpreter to magically 
improve their O(n^2) code!
 

> That *is* different from other high-level languages, in which it is 
> typically *not* possible to write performance-critical inner-loop code 
> without dropping down to a lower-level language (C, Fortran, Cython...).   
> If you are coding exclusively in Python or R, and there isn't an optimized 
> function appropriate for the innermost loops of your task at hand, you are 
> out of luck.
>

Also, very true...  I do hope that any issues that make my C version of UTF 
conversion routines faster than my equivalent Julia versions will be 
addressed before too long.
(and I don't even think it is that far off, or hard for any particular 
reason) 


Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-01 Thread Scott Jones


On Friday, May 1, 2015 at 12:42:57 PM UTC-4, Steven G. Johnson wrote:
>
>
>
> On Thursday, April 30, 2015 at 6:10:58 PM UTC-4, Scott Jones wrote:
>>
>> Yes... Python will win on string processing... esp. with Python 3... I 
>> quickly ran into things that were > 800x faster in Python...
>> (I hope to help change that though!)
>>
>
> The "800x" faster example that you've referred to several times, if I 
> recall correctly, is one where you repeatedly concatenate strings.  In 
> CPython, under certain circumstances, this is optimized to mutating one of 
> the strings in-place and is consequently O(n) where n is the final length, 
> although this is not guaranteed by the language itself.  In Julia, Ruby, 
> Java, Go, and many other languages, concatenation allocates a new string 
> and hence building a string by repeated concatenation is O(n^2).   That 
> doesn't mean that those other languages "lose" on string processing to 
> Python, it just means that you have to do things slightly differently (e.g. 
> write to an IOBuffer in Julia).
>

I just don't think that IOBuffers are a very good way to do that...  what I 
really need are mutable strings... and I know there is a package, and I 
need to investigate that further...
it's something that would be nice to have as part of the core of the 
language, instead of having to use either Vectors or IOBuffers...
As a new user, I would think: if I'm not doing I/O, why should I be using an 
IOBuffer...
 

> You can't always expect the *same code* (translated as literally as 
> possible) to be the optimal approach in different languages, and it is 
> inflammatory to compare languages according to this standard.
>

I was not intending to be inflammatory, just relating what my first 
experience was, which led me to investigate much more deeply the good 
and bad issues in Julia wrt performance (more good than bad, by a long 
shot).
 

> A fairer question is whether it is *much harder* to get good performance 
> in one language vs. another for a certain task.   There will certainly be 
> tasks where Python is still superior in this sense simply because there are 
> many cases where Python calls highly tuned C libraries for operations that 
> have not been as optimized in Julia.  Julia will tend to shine the further 
> you stray from "built-in" operations in your performance-critical code.
>

Yes, that is true... and that is why I'm betting on Julia in the long run 
(the other option for a lot of the code would have been Python or C++11, 
and I've already found Julia easier to deal with than either of them, even 
in its pre-1.0 state) 


Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-01 Thread Scott Jones


On Friday, May 1, 2015 at 12:38:33 PM UTC-4, Jeff Bezanson wrote:
>
> Steven -- I agree and I find it very refreshing that you're willing to 
> judge a language by more than just performance. Any given language can 
> always be optimized better, so ideally you want to compare them by 
> more robust criteria. 
>
> Obviously a particular system might have a well-tuned library routine 
> that's faster than our equivalent. But think about it: is having a 
> slow interpreter, and relying on code to spend all its time in 
> pre-baked library kernels the *right* way to get performance? That's 
> just the same boring design that has been used over and over again, in 
> matlab, IDL, octave, R, etc. In those cases the language isn't 
> bringing much to the table, except a pile of rules about how important 
> code must still be written in C/Fortran, and how your code must be 
> vectorized or shame on you. 
>

That's a very good point... and is one of the things I like a lot about 
Julia...
Even with my initial surprise about a single performance issue (the 
building up a string by concatenation), I did NOT judge Julia by that alone,
and have been quite happy with it overall [and I've been converting all of 
the developers at the startup where I'm consulting to Julia fans].
I also have faith, from what I've seen so far, that performance issues 
*will* be addressed, as well as possible considering the architecture and 
goals of the language, by a number of pretty smart people, both in and 
outside of the "core" team.

Scott


Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-01 Thread Scott Jones


On Friday, May 1, 2015 at 11:48:21 AM UTC-4, Tim Holy wrote:
>
> On Friday, May 01, 2015 08:03:31 AM Scott Jones wrote: 
> > Still, same issue as I described above... probably better to increase by 
> 2x 
> > up to a point, and then by chunk sizes, where the chunk sizes might 
> slowly 
> > get larger... 
>
> I see your point, but it will also break the O(nlogn) scaling. We couldn't 
> hard-code the cutoff, because some people run julia on machines with 4GB 
> of RAM 
> and others with 1TB of RAM. So, we could query the amount of RAM available 
> and 
> switch based on that result, but since all this would only make a 
> difference 
> for operations that consume between 0.5x and 1x the user's RAM (which to 
> me 
> seems like a very narrow window, on the log scale), is it really worth the 
> trouble? 
>
> --Tim 
>

For what I was doing, yes, it was definitely worth the trouble, because 
you'd have systems with 10s of thousands of processes (the limit was 64K on 
a single node), and you had to be very careful about not using up too much 
memory, and ending up thrashing...
Very different than when you maybe have a process for each core, and you 
have lots of memory for each one...
Different usage... different performance issues... 


Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-01 Thread Jameson Nash
I believe that both are actually very similar in that manner. I think the
main difference comes from the fact that Julia is an attempt to design the
core library to support and use the efficient constructs, while Numba and
other related projects are, for better or worse, inheriting the default
python semantics and built-in libraries.

Sometimes a new language is better than an old language simply because it
can drop compatibility concerns. For example, Java is known for providing
far more consistent multi-threading support than C, since it is a language
construct and not an add-on feature. It was possible in both, one just made
it easier for the programmer to access. Similarly, Node made it feasible to
write programs without any concept of a blocking operation. Again, this was
already possible in languages like Python and C, but Node (with its legacy
in JavaScript) made it a feature of the language and designed all of the
core APIs to deal with it.


On Fri, May 1, 2015 at 2:27 PM Steven G. Johnson 
wrote:

>
>
> On Friday, May 1, 2015 at 2:04:44 PM UTC-4, Steven Sagaert wrote:
>>
>> like I said: I like Julia and I am rooting for it but just to play
>> devil's advocate: I believe it's also a goal (& possibility) of numba to
>> write C-level efficient code in Python. All you have to do is add an
>> annotation here and there.
>>
>
> Numba is arguably a 2nd lower-level language that happens to be embedded
> in Python — it is telling that Numba's documentation explicitly states that
> it can only get good performance when it is able to JIT the inner loops in
> "nopython mode" — basically, code that doesn't stray outside a small set of
> types.
>


Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-01 Thread Steven G. Johnson


On Friday, May 1, 2015 at 2:04:44 PM UTC-4, Steven Sagaert wrote:
>
> like I said: I like Julia and I am rooting for it but just to play devil's 
> advocate: I believe it's also a goal (& possibility) of numba to write 
> C-level efficient code in Python. All you have to do is add an annotation here 
> and there. 
>

Numba is arguably a 2nd lower-level language that happens to be embedded in 
Python — it is telling that Numba's documentation explicitly states that it 
can only get good performance when it is able to JIT the inner loops in 
"nopython mode" — basically, code that doesn't stray outside a small set of 
types.


Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-01 Thread Steven Sagaert


On Friday, May 1, 2015 at 7:23:40 PM UTC+2, Steven G. Johnson wrote:
>
> On Friday, May 1, 2015 at 1:12:00 PM UTC-4, Steven Sagaert wrote: 
>>
>> That wasn't what I was saying. I like the philosophy behind julia. But in 
>> practice (as of now) even in julia you still have to code in a certain 
>> style if you want very good performance and that's no different than in any 
>> other language.
>>
>
> The goal of Julia is not to be a language in which it is *impossible* to 
> write slow code, or a language in which all programming styles are equally 
> fast. 
>

I didn't say that was a goal of Julia, but it sure would be nice to have :) 
though probably an impossible dream.
 

>   The goal (or at least, one of the goals) is to be an expressive, 
> high-level dynamic language, in which it is also *possible* to write 
> performance-critical inner-loop code.
>
> That *is* different from other high-level languages, in which it is 
> typically *not* possible to write performance-critical inner-loop code 
> without dropping down to a lower-level language (C, Fortran, Cython...).   
> If you are coding exclusively in Python or R, and there isn't an optimized 
> function appropriate for the innermost loops of your task at hand, you are 
> out of luck.
>
like I said: I like Julia and I am rooting for it but just to play devil's 
advocate: I believe it's also a goal (& possibility) of numba to write 
C-level efficient code in Python. All you have to do is add an annotation here 
and there. 


Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-01 Thread Steven G. Johnson
On Friday, May 1, 2015 at 1:12:00 PM UTC-4, Steven Sagaert wrote: 
>
> That wasn't what I was saying. I like the philosophy behind julia. But in 
> practice (as of now) even in julia you still have to code in a certain 
> style if you want very good performance and that's no different than in any 
> other language.
>

The goal of Julia is not to be a language in which it is *impossible* to 
write slow code, or a language in which all programming styles are equally 
fast.   The goal (or at least, one of the goals) is to be an expressive, 
high-level dynamic language, in which it is also *possible* to write 
performance-critical inner-loop code.

That *is* different from other high-level languages, in which it is 
typically *not* possible to write performance-critical inner-loop code 
without dropping down to a lower-level language (C, Fortran, Cython...).   
If you are coding exclusively in Python or R, and there isn't an optimized 
function appropriate for the innermost loops of your task at hand, you are 
out of luck.


Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-01 Thread Steven Sagaert



>
> Obviously a particular system might have a well-tuned library routine 
> that's faster than our equivalent. But think about it: is having a 
> slow interpreter, and relying on code to spend all its time in 
> pre-baked library kernels the *right* way to get performance? That's 
> just the same boring design that has been used over and over again, in 
> matlab, IDL, octave, R, etc. In those cases the language isn't 
> bringing much to the table, except a pile of rules about how important 
> code must still be written in C/Fortran, and how your code must be 
> vectorized or shame on you.


That wasn't what I was saying. I like the philosophy behind julia. But in 
practice (as of now) even in julia you still have to code in a certain 
style if you want very good performance and that's no different than in any 
other language. Ideally, of course, the compiler should be able to optimize 
the code so that different styles (e.g. functional/vectorized style vs 
imperative/loops style) give the same performance and the programmer 
doesn't have to think about it; maybe one day it will be like that in 
julia, but we're not quite there yet AFAIK.

Having said that, I like Julia and hopefully it will keep on getting 
better/faster. So good job and keep up the good work.

>
>
> On Fri, May 1, 2015 at 11:48 AM, Tim Holy wrote: 
> > On Friday, May 01, 2015 08:03:31 AM Scott Jones wrote: 
> >> Still, same issue as I described above... probably better to increase 
> by 2x 
> >> up to a point, and then by chunk sizes, where the chunk sizes might 
> slowly 
> >> get larger... 
> > 
> > I see your point, but it will also break the O(nlogn) scaling. We 
> couldn't 
> > hard-code the cutoff, because some people run julia on machines with 4GB 
> of RAM 
> > and others with 1TB of RAM. So, we could query the amount of RAM 
> available and 
> > switch based on that result, but since all this would only make a 
> difference 
> > for operations that consume between 0.5x and 1x the user's RAM (which to 
> me 
> > seems like a very narrow window, on the log scale), is it really worth 
> the 
> > trouble? 
> > 
> > --Tim 
> > 
>


Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-01 Thread Stefan Karpinski
I'll quote one of my comments on this StackOverflow question:

That all depends on what you are trying to measure. Personally, I'm not at
> all interested in how fast one can compute Fibonacci numbers. Yet that is
> one of our benchmarks. Why? Because I am very interested in how well
> languages support recursion – and the doubly recursive algorithm happens to
> be a great test of recursion, precisely because it is such a terrible way
> to compute Fibonacci numbers. So what would be learned by comparing an
> intentionally slow, excessively recursive algorithm in C and Julia against
> a tricky, clever, vectorized algorithm in R? Nothing at all.
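
(The benchmark in question is the deliberately naive doubly recursive 
definition, something like:

fib(n) = n < 2 ? n : fib(n - 1) + fib(n - 2)   # exponential time, on purpose

fib(20)   # 6765

the point being to stress recursion, not to compute Fibonacci numbers fast.)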


On Fri, May 1, 2015 at 12:58 PM, Steven Sagaert 
wrote:

> Of course I'm not saying loops should not be benchmarked and I do use
> loops in julia also. I'm just saying that when doing performance comparison
> one should try to write the programs in each language in their most optimal
> style rather than similar style which is optimal for one language but very
> suboptimal in another language.
> Ah I didn't know the article was rebutted by Stefan. I read that article
> before that happened and just looked it up again now as an example.
>
> I guess the conclusion is that cross-language performance benchmarks are
> very tricky which was kinda my point :)
>
>
> On Friday, May 1, 2015 at 3:13:24 PM UTC+2, Tim Holy wrote:
>>
>> Hi Steven,
>>
>> I understand your point---you're saying you'd be unlikely to write those
>> algorithms in that manner, if your goal were to do those particular
>> computations. But the important point to keep in mind is that those
>> benchmarks
>> are simply "toys" for the purpose of testing performance of various
>> language
>> constructs. If you think it's irrelevant to benchmark loops for
>> scientific
>> code, then you do very, very different stuff than me. Not all algorithms
>> reduce
>> to BLAS calls. I use julia to write all kinds of algorithms that I used
>> to
>> write MEX functions for, back in my Matlab days. If all you need is A*b,
>> then
>> of course basically any scientific language will be just fine, with
>> minimal
>> differences in performance.
>>
>> Moreover, that R benchmark on cumsum is simply not credible. I'm not sure
>> what
>> was happening (and that article doesn't post its code or procedures used
>> to
>> test), but julia's cumsum reduces to efficient machine code (basically, a
>> bunch
>> of addition operations). If they were computing cumsum across a specific
>> dimension, then this PR:
>> https://github.com/JuliaLang/julia/pull/7359
>> changed things. But more likely, someone forgot to run the code twice (so
>> it
>> got JIT-compiled), had a type-instability in the code they were testing,
>> or
>> some other mistake. It's too bad one can make mistakes, of course, but
>> then it
>> becomes a comparison of different programmers rather than different
>> programming
>> languages.
>>
>> Indeed, if you read the comments in that post, Stefan already rebutted
>> that
>> benchmark, with a 4x advantage for Julia:
>>
>> https://matloff.wordpress.com/2014/05/21/r-beats-python-r-beats-julia-anyone-else-wanna-challenge-r/comment-page-1/#comment-89
>>
>> --Tim
>>
>>
>>
>> On Friday, May 01, 2015 01:25:50 AM Steven Sagaert wrote:
>> > I think the performance comparisons between Julia & Python are flawed.
>> They
>> > seem to be between standard Python & Julia but since Julia is all about
>> > scientific programming it really should be between SciPi & Julia. Since
>> > SciPi uses much of the same underlying libs in Fortran/C the
>> performance
>> > gap will be much smaller and to be really fair it should be between
>> numba
>> > compiled SciPi code & julia. I suspect the performance will be very
>> close
>> > then (and close to C performance).
>> >
>> > Similarly the standard benchmark (on the opening page of julia website)
>> > between R & julia is also flawed because it takes the best case
>> scenario
>> > for julia (loops & mutable datastructures) & the worst case scenario
>> for R.
>> > When the same R program is rewritten in vectorised style it beat julia
>> > see
>> >
>> https://matloff.wordpress.com/2014/05/21/r-beats-python-r-beats-julia-anyon
>> > e-else-wanna-challenge-r/.
>> >
>> > So my interest in julia isn't because it is the fastest scientific high
>> > level language (because clearly at this stage you can't really claim
>> that)
>> > but because it's a clean interesting language (still needs work for
>> some
>> > rough edges of course) with clean(er) & clear(er) libraries  and that
>> gives
>> > reasonable performance out of the box without much tweaking.
>> >
>> > On Friday, May 1, 2015 at 12:10:58 AM UTC+2, Scott Jones wrote:
>> > > Yes... Python will win on string processing... esp. with Python 3...
>> I
>> > > quickly ran into things that were > 800x faster in Python...
>> > > (I hope to help change that though!)
>> > >
>> >

Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-01 Thread Steven Sagaert
Of course I'm not saying loops should not be benchmarked, and I do use loops 
in julia also. I'm just saying that when doing performance comparisons one 
should try to write the programs in each language in their most optimal 
style, rather than in a similar style that is optimal for one language but 
very suboptimal in another.
Ah I didn't know the article was rebutted by Stefan. I read that article 
before that happened and just looked it up again now as an example.

I guess the conclusion is that cross-language performance benchmarks are 
very tricky which was kinda my point :)

On Friday, May 1, 2015 at 3:13:24 PM UTC+2, Tim Holy wrote:
>
> Hi Steven, 
>
> I understand your point---you're saying you'd be unlikely to write those 
> algorithms in that manner, if your goal were to do those particular 
> computations. But the important point to keep in mind is that those 
> benchmarks 
> are simply "toys" for the purpose of testing performance of various 
> language 
> constructs. If you think it's irrelevant to benchmark loops for scientific 
> code, then you do very, very different stuff than me. Not all algorithms 
> reduce 
> to BLAS calls. I use julia to write all kinds of algorithms that I used to 
> write MEX functions for, back in my Matlab days. If all you need is A*b, 
> then 
> of course basically any scientific language will be just fine, with 
> minimal 
> differences in performance. 
>
> Moreover, that R benchmark on cumsum is simply not credible. I'm not sure 
> what 
> was happening (and that article doesn't post its code or procedures used 
> to 
> test), but julia's cumsum reduces to efficient machine code (basically, a 
> bunch 
> of addition operations). If they were computing cumsum across a specific 
> dimension, then this PR: 
> https://github.com/JuliaLang/julia/pull/7359 
> changed things. But more likely, someone forgot to run the code twice (so 
> it 
> got JIT-compiled), had a type-instability in the code they were testing, 
> or 
> some other mistake. It's too bad one can make mistakes, of course, but 
> then it 
> becomes a comparison of different programmers rather than different 
> programming 
> languages. 
>
> Indeed, if you read the comments in that post, Stefan already rebutted 
> that 
> benchmark, with a 4x advantage for Julia: 
>
> https://matloff.wordpress.com/2014/05/21/r-beats-python-r-beats-julia-anyone-else-wanna-challenge-r/comment-page-1/#comment-89
>  
>
> --Tim 
>
>
>
> On Friday, May 01, 2015 01:25:50 AM Steven Sagaert wrote: 
> > I think the performance comparisons between Julia & Python are flawed. 
> They 
> > seem to be between standard Python & Julia but since Julia is all about 
> > scientific programming it really should be between SciPi & Julia. Since 
> > SciPi uses much of the same underlying libs in Fortran/C the performance 
> > gap will be much smaller and to be really fair it should be between 
> numba 
> > compiled SciPi code & julia. I suspect the performance will be very 
> close 
> > then (and close to C performance). 
> > 
> > Similarly the standard benchmark (on the opening page of julia website) 
> > between R & julia is also flawed because it takes the best case scenario 
> > for julia (loops & mutable datastructures) & the worst case scenario for 
> R. 
> > When the same R program is rewritten in vectorised style it beat julia 
> > see 
> > 
> https://matloff.wordpress.com/2014/05/21/r-beats-python-r-beats-julia-anyon 
> > e-else-wanna-challenge-r/. 
> > 
> > So my interest in julia isn't because it is the fastest scientific high 
> > level language (because clearly at this stage you can't really claim 
> that) 
> > but because it's a clean interesting language (still needs work for some 
> > rough edges of course) with clean(er) & clear(er) libraries  and that 
> gives 
> > reasonable performance out of the box without much tweaking. 
> > 
> > On Friday, May 1, 2015 at 12:10:58 AM UTC+2, Scott Jones wrote: 
> > > Yes... Python will win on string processing... esp. with Python 3... I 
> > > quickly ran into things that were > 800x faster in Python... 
> > > (I hope to help change that though!) 
> > > 
> > > Scott 
> > > 
> > > On Thursday, April 30, 2015 at 6:01:45 PM UTC-4, Páll Haraldsson 
> wrote: 
> > >> I wouldn't expect a difference in Julia for code like that (didn't 
> > >> check). But I guess what we are often seeing is someone comparing a 
> tuned 
> > >> Python code to newbie Julia code. I still want it faster than that 
> code.. 
> > >> (assuming same algorithm, note row vs. column major caveat). 
> > >> 
> > >> The main point of mine, *should* Python at any time win? 
> > >> 
> > >> 2015-04-30 21:36 GMT+00:00 Sisyphuss : 
> > >>> This post interests me. I'll write something here to follow this 
> post. 
> > >>> 
> > >>> The performance gap between normal code in Python and badly-written 
> code 
> > >>> in Julia is something I'd like to know too. 
> > >>> As far as I know, Python interpret does some mysterious 
> optimiz

Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-01 Thread Steven G. Johnson


On Thursday, April 30, 2015 at 6:10:58 PM UTC-4, Scott Jones wrote:
>
> Yes... Python will win on string processing... esp. with Python 3... I 
> quickly ran into things that were > 800x faster in Python...
> (I hope to help change that though!)
>

The "800x" faster example that you've referred to several times, if I 
recall correctly, is one where you repeatedly concatenate strings.  In 
CPython, under certain circumstances, this is optimized to mutating one of 
the strings in-place and is consequently O(n) where n is the final length, 
although this is not guaranteed by the language itself.  In Julia, Ruby, 
Java, Go, and many other languages, concatenation allocates a new string 
and hence building a string by repeated concatenation is O(n^2).   That 
doesn't mean that those other languages "lose" on string processing to 
Python, it just means that you have to do things slightly differently (e.g. 
write to an IOBuffer in Julia).

You can't always expect the *same code* (translated as literally as 
possible) to be the optimal approach in different languages, and it is 
inflammatory to compare languages according to this standard.

A fairer question is whether it is *much harder* to get good performance in 
one language vs. another for a certain task.   There will certainly be 
tasks where Python is still superior in this sense simply because there are 
many cases where Python calls highly tuned C libraries for operations that 
have not been as optimized in Julia.  Julia will tend to shine the further 
you stray from "built-in" operations in your performance-critical code.


Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-01 Thread Jameson Nash
The threshold would likely be most beneficial if it was based on pagesize
(which is constant relative to RAM size). For small allocations (less than
several megabytes), a modern malloc implementation typically uses a pool,
so growing an allocation (except by a small amount) will probably result in
a copy anyway, and no memory reuse. Once malloc switches to direct mmap
calls, then it probably makes sense to add pages at a more gradual rate.

On Fri, May 1, 2015 at 11:48 AM Tim Holy  wrote:

> On Friday, May 01, 2015 08:03:31 AM Scott Jones wrote:
> > Still, same issue as I described above... probably better to increase by
> 2x
> > up to a point, and then by chunk sizes, where the chunk sizes might
> slowly
> > get larger...
>
> I see your point, but it will also break the O(nlogn) scaling. We couldn't
> hard-code the cutoff, because some people run julia on machines with 4GB
> of RAM
> and others with 1TB of RAM. So, we could query the amount of RAM available
> and
> switch based on that result, but since all this would only make a
> difference
> for operations that consume between 0.5x and 1x the user's RAM (which to me
> seems like a very narrow window, on the log scale), is it really worth the
> trouble?
>
> --Tim
>
>


Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-01 Thread Jeff Bezanson
Steven -- I agree and I find it very refreshing that you're willing to
judge a language by more than just performance. Any given language can
always be optimized better, so ideally you want to compare them by
more robust criteria.

Obviously a particular system might have a well-tuned library routine
that's faster than our equivalent. But think about it: is having a
slow interpreter, and relying on code to spend all its time in
pre-baked library kernels the *right* way to get performance? That's
just the same boring design that has been used over and over again, in
matlab, IDL, octave, R, etc. In those cases the language isn't
bringing much to the table, except a pile of rules about how important
code must still be written in C/Fortran, and how your code must be
vectorized or shame on you.

On Fri, May 1, 2015 at 11:48 AM, Tim Holy  wrote:
> On Friday, May 01, 2015 08:03:31 AM Scott Jones wrote:
>> Still, same issue as I described above... probably better to increase by 2x
>> up to a point, and then by chunk sizes, where the chunk sizes might slowly
>> get larger...
>
> I see your point, but it will also break the O(nlogn) scaling. We couldn't
> hard-code the cutoff, because some people run julia on machines with 4GB of 
> RAM
> and others with 1TB of RAM. So, we could query the amount of RAM available and
> switch based on that result, but since all this would only make a difference
> for operations that consume between 0.5x and 1x the user's RAM (which to me
> seems like a very narrow window, on the log scale), is it really worth the
> trouble?
>
> --Tim
>


Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-01 Thread Tim Holy
On Friday, May 01, 2015 08:03:31 AM Scott Jones wrote:
> Still, same issue as I described above... probably better to increase by 2x 
> up to a point, and then by chunk sizes, where the chunk sizes might slowly
> get larger...

I see your point, but it will also break the O(nlogn) scaling. We couldn't 
hard-code the cutoff, because some people run julia on machines with 4GB of RAM 
and others with 1TB of RAM. So, we could query the amount of RAM available and 
switch based on that result, but since all this would only make a difference 
for operations that consume between 0.5x and 1x the user's RAM (which to me 
seems like a very narrow window, on the log scale), is it really worth the 
trouble?

--Tim



Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-01 Thread Scott Jones


On Friday, May 1, 2015 at 9:16:43 AM UTC-4, Tim Holy wrote:
>
> On Friday, May 01, 2015 03:19:03 AM Scott Jones wrote: 
> > As the string grows, Julia's internals end up having to reallocate the 
> > memory and sometimes copy it to a new location, hence the O(n^2) nature 
> of 
> > the code. 
>
> Small correction: push! is not O(n^2), it's O(nlogn). Internally, the 
> storage 
> array grows by factors of 2 [1]; after one allocation of size 2n you can 
> add n 
> more elements without reallocating. 
>

Good to know. I hate to say it, but the performance looked so bad to me that 
I didn't bother to check whether it even had that optimization (which is 
exactly what I did for strings in the language I used to develop).

Does it always grow by factors of 2?  That might not be so good...  we 
found that after a certain point, it was better to increase in chunks, say 
of 64K or 1M, because increasing the size of large LOBs that way could 
make you run out of memory fairly quickly...

 

> That said, O(nlogn) can be pretty easily beat by O(2n): make one pass 
> through 
> and count how many you'll need, allocate the whole thing, and then stuff 
> in 
> elements. As you seem to be planning to do. 
>

Yes, and I have very nice performance improvements to show for it (most were 
around 4-10x faster; go look at what I put in my gist), and that's even 
with my pure Julia version... :-)
 

>
> --Tim 
>
> [1] Last I looked, that is; there was some discussion about switching it 
> to 
> something like 1.5 because of various discussions of memory fragmentation 
> and 
> reuse. 
>

Still, same issue as I described above... probably better to increase by 2x 
up to a point, and then by chunk sizes, where the chunk sizes might slowly 
get larger...
 


Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-01 Thread Tim Holy
On Friday, May 01, 2015 03:19:03 AM Scott Jones wrote:
> As the string grows, Julia's internals end up having to reallocate the 
> memory and sometimes copy it to a new location, hence the O(n^2) nature of 
> the code.

Small correction: push! is not O(n^2), it's O(nlogn). Internally, the storage 
array grows by factors of 2 [1]; after one allocation of size 2n you can add n 
more elements without reallocating.

That said, O(nlogn) can be pretty easily beat by O(2n): make one pass through 
and count how many you'll need, allocate the whole thing, and then stuff in 
elements. As you seem to be planning to do.
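
A sketch of the two approaches, with a made-up fill value:

# grow incrementally: push! reallocates and copies as the array outgrows its storage
function fill_push(n)
    v = Int[]
    for i in 1:n
        push!(v, i * i)
    end
    v
end

# count first, allocate once, then fill: no reallocation at all
function fill_prealloc(n)
    v = zeros(Int, n)
    for i in 1:n
        v[i] = i * i
    end
    v
end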

--Tim

[1] Last I looked, that is; there was some discussion about switching it to 
something like 1.5 because of various discussions of memory fragmentation and 
reuse.



Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-01 Thread Tim Holy
Hi Steven,

I understand your point---you're saying you'd be unlikely to write those 
algorithms in that manner, if your goal were to do those particular 
computations. But the important point to keep in mind is that those benchmarks 
are simply "toys" for the purpose of testing performance of various language 
constructs. If you think it's irrelevant to benchmark loops for scientific 
code, then you do very, very different stuff than me. Not all algorithms reduce 
to BLAS calls. I use julia to write all kinds of algorithms that I used to 
write MEX functions for, back in my Matlab days. If all you need is A*b, then 
of course basically any scientific language will be just fine, with minimal 
differences in performance.

Moreover, that R benchmark on cumsum is simply not credible. I'm not sure what 
was happening (and that article doesn't post its code or procedures used to 
test), but julia's cumsum reduces to efficient machine code (basically, a bunch 
of addition operations). If they were computing cumsum across a specific 
dimension, then this PR:
https://github.com/JuliaLang/julia/pull/7359
changed things. But more likely, someone forgot to run the code twice (so it 
got JIT-compiled), had a type-instability in the code they were testing, or 
some other mistake. It's too bad one can make mistakes, of course, but then it 
becomes a comparison of different programmers rather than different programming 
languages.
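
(For anyone re-running numbers like that, a minimal sketch of the warm-up
pitfall; cumsum_loop is just a stand-in for whatever was benchmarked, not
Base's cumsum:)

    function cumsum_loop(x)
        out = similar(x)
        s = zero(eltype(x))
        for i = 1:length(x)
            s += x[i]
            out[i] = s
        end
        return out
    end

    x = rand(10^7)
    @time cumsum_loop(x)   # first call includes JIT compilation
    @time cumsum_loop(x)   # this second timing is the one to report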

Indeed, if you read the comments in that post, Stefan already rebutted that 
benchmark, with a 4x advantage for Julia:
https://matloff.wordpress.com/2014/05/21/r-beats-python-r-beats-julia-anyone-else-wanna-challenge-r/comment-page-1/#comment-89

--Tim



On Friday, May 01, 2015 01:25:50 AM Steven Sagaert wrote:
> I think the performance comparisons between Julia & Python are flawed. They
> seem to be between standard Python & Julia but since Julia is all about
> scientific programming it really should be between SciPy & Julia. Since 
> SciPy uses much of the same underlying libs in Fortran/C the performance 
> gap will be much smaller and to be really fair it should be between Numba 
> compiled SciPy code & julia. I suspect the performance will be very close 
> then (and close to C performance).
> 
> Similarly the standard benchmark (on the opening page of julia website)
> between R & julia is also flawed because it takes the best case scenario
> for julia (loops & mutable datastructures) & the worst case scenario for R.
> When the same R program is rewritten in vectorised style it beat julia
> see
> https://matloff.wordpress.com/2014/05/21/r-beats-python-r-beats-julia-anyon
> e-else-wanna-challenge-r/.
> 
> So my interest in julia isn't because it is the fastest scientific high
> level language (because clearly at this stage you can't really claim that)
> but because it's a clean interesting language (still needs work for some
> rough edges of course) with clean(er) & clear(er) libraries  and that gives
> reasonable performance out of the box without much tweaking.
> 
> On Friday, May 1, 2015 at 12:10:58 AM UTC+2, Scott Jones wrote:
> > Yes... Python will win on string processing... esp. with Python 3... I
> > quickly ran into things that were > 800x faster in Python...
> > (I hope to help change that though!)
> > 
> > Scott
> > 
> > On Thursday, April 30, 2015 at 6:01:45 PM UTC-4, Páll Haraldsson wrote:
> >> I wouldn't expect a difference in Julia for code like that (didn't
> >> check). But I guess what we are often seeing is someone comparing a tuned
> >> Python code to newbie Julia code. I still want it faster than that code..
> >> (assuming same algorithm, note row vs. column major caveat).
> >> 
> >> The main point of mine, *should* Python at any time win?
> >> 
> >> 2015-04-30 21:36 GMT+00:00 Sisyphuss :
> >>> This post interests me. I'll write something here to follow this post.
> >>> 
> >>> The performance gap between normal code in Python and badly-written code
> >>> in Julia is something I'd like to know too.
> >>> As far as I know, the Python interpreter does some mysterious optimizations.
> >>> For example `(x**2)**2` is 100x faster than `x**4`.
> >>> 
> >>> On Thursday, April 30, 2015 at 9:58:35 PM UTC+2, Páll Haraldsson wrote:
>  Hi,
>  
>  [As a best language is subjective, I'll put that aside for a moment.]
>  
>  Part I.
>  
>  The goal, as I understand, for Julia is at least within a factor of two
>  of C and already matching it mostly and long term beating that (and
>  C++).
>  [What other goals are there? How about 0.4 now or even 1.0..?]
>  
>  While that is the goal as a language, you can write slow code in any
>  language and Julia makes that easier. :) [If I recall, Bezanson
>  mentioned
>  it (the global "problem") as a feature, any change there?]
>  
>  
>  I've been following this forum for months and newbies hit the same
>  issues. But almost always without fail, Julia can be speed up (easily
> 

Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-01 Thread Patrick O'Leary
On Friday, May 1, 2015 at 3:25:50 AM UTC-5, Steven Sagaert wrote:
>
> I think the performance comparisons between Julia & Python are flawed. 
> They seem to be between standard Python & Julia but since Julia is all 
> about scientific programming it really should be between SciPy & Julia. 
> Since SciPy uses much of the same underlying libs in Fortran/C the 
> performance gap will be much smaller and to be really fair it should be 
> between Numba compiled SciPy code & julia. I suspect the performance will 
> be very close then (and close to C performance).
>
> Similarly the standard benchmark (on the opening page of julia website) 
> between R & julia is also flawed because it takes the best case scenario 
> for julia (loops & mutable datastructures) & the worst case scenario for R. 
> When the same R program is rewritten in vectorised style it beat julia see 
> https://matloff.wordpress.com/2014/05/21/r-beats-python-r-beats-julia-anyone-else-wanna-challenge-r/
> .
>

All benchmarks are flawed in that sense--a single benchmark can't tell you 
everything. The Julia performance benchmarks are testing algorithms 
expressed in the languages themselves. It is not a test of foreign-function 
interfaces and BLAS implementations, so the benchmarks don't test that. 
This has been discussed at length--as one example, see 
https://github.com/JuliaLang/julia/issues/2412.


Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-01 Thread Steven Sagaert


On Friday, May 1, 2015 at 12:26:54 PM UTC+2, Scott Jones wrote:
>
>
>
> On Friday, May 1, 2015 at 4:25:50 AM UTC-4, Steven Sagaert wrote:
>>
>> I think the performance comparisons between Julia & Python are flawed. 
>> They seem to be between standard Python & Julia but since Julia is all 
>> about scientific programming it really should be between SciPy & Julia. 
>> Since SciPy uses much of the same underlying libs in Fortran/C the 
>> performance gap will be much smaller and to be really fair it should be 
>> between Numba compiled SciPy code & julia. I suspect the performance will 
>> be very close then (and close to C performance).
>>
>
> Why should Julia be limited to scientific programming?
> I think it can be a great language for general programming, 
>

I agree, but for now & the short-term future I think the core domain of 
julia is scientific computing/data science, and so to have fair comparisons 
one should not just compare julia to vanilla Python but especially to 
SciPy & Numba.
 

> for the most part, I think it already is (it can use some changes for 
> string handling [I'd like to work on that ;-)], decimal floating point 
> support [that is currently being addressed, kudos to Steven G. Johnson], 
> maybe some better language constructs to allow better software engineering 
> practices [that is being hotly debated!], and definitely a real debugger [I 
> think keno is working on that]).
>

> Comparing Julia to Python for general computing is totally valid and 
> interesting.
> Comparing Julia to SciPy for scientific computing is also totally valid 
> and interesting.
>
>> Similarly the standard benchmark (on the opening page of julia website) 
>> between R & julia is also flawed because it takes the best case scenario 
>> for julia (loops & mutable datastructures) & the worst case scenario for R. 
>> When the same R program is rewritten in vectorised style it beat julia see 
>> https://matloff.wordpress.com/2014/05/21/r-beats-python-r-beats-julia-anyone-else-wanna-challenge-r/
>> .
>>
>> So my interest in julia isn't because it is the fastest scientific high 
>> level language (because clearly at this stage you can't really claim that) 
>> but because it's a clean interesting language (still needs work for some 
>> rough edges of course) with clean(er) & clear(er) libraries  and that gives 
>> reasonable performance out of the box without much tweaking. 
>>
>

Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-01 Thread Scott Jones


On Friday, May 1, 2015 at 4:25:50 AM UTC-4, Steven Sagaert wrote:
>
> I think the performance comparisons between Julia & Python are flawed. 
> They seem to be between standard Python & Julia but since Julia is all 
> about scientific programming it really should be between SciPy & Julia. 
> Since SciPy uses much of the same underlying libs in Fortran/C the 
> performance gap will be much smaller and to be really fair it should be 
> between Numba compiled SciPy code & julia. I suspect the performance will 
> be very close then (and close to C performance).
>

Why should Julia be limited to scientific programming?
I think it can be a great language for general programming; for the most 
part, I think it already is (it could use some changes for string handling 
[I'd like to work on that ;-)], decimal floating point support [that is 
currently being addressed, kudos to Steven G. Johnson], maybe some better 
language constructs to allow better software engineering practices [that is 
being hotly debated!], and definitely a real debugger [I think keno is 
working on that]).

Comparing Julia to Python for general computing is totally valid and 
interesting.
Comparing Julia to SciPy for scientific computing is also totally valid and 
interesting.

> Similarly the standard benchmark (on the opening page of julia website) 
> between R & julia is also flawed because it takes the best case scenario 
> for julia (loops & mutable datastructures) & the worst case scenario for R. 
> When the same R program is rewritten in vectorised style it beat julia see 
> https://matloff.wordpress.com/2014/05/21/r-beats-python-r-beats-julia-anyone-else-wanna-challenge-r/
> .
>
> So my interest in julia isn't because it is the fastest scientific high 
> level language (because clearly at this stage you can't really claim that) 
> but because it's a clean interesting language (still needs work for some 
> rough edges of course) with clean(er) & clear(er) libraries  and that gives 
> reasonable performance out of the box without much tweaking. 
>


Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-01 Thread Scott Jones


On Friday, May 1, 2015 at 1:25:41 AM UTC-4, Jeff Bezanson wrote:
>
> It is true that we have not yet done enough to optimize the worst and 
> worse performance cases. The bright side of that is that we have room 
> to improve; it's not that we've run out of ideas and techniques. 
>
> Tim is right that the complexity of our dispatch system makes julia 
> potentially slower than python. But in dispatch-heavy code I've seen 
> cases where we are faster or slower; it depends. 
>
> Python's string and dictionary operations, in particular, are really 
> fast. This is not surprising considering what the language was 
> designed for, and that they have a big library of well-tuned C code 
> for these things. 
>
> I still maintain that it is misleading to describe an *asymptotic* 
> slowdown as "800x slower". If you name a constant factor, it sounds 
> like you're talking about a constant factor slowdown. But the number 
> is arbitrary, because it depends on data size. In theory, of course, 
> an asymptotic slowdown is *much worse* than a constant factor 
> slowdown. However in the systems world constant factors are often more 
> important, and are often what we talk about. 
>

No, that was just my very first test comparing Julia & Python, using a size 
that matched the record sizes I'd typically seen from way too many years of
benchmarking (database / string processing operations)
 

> You say "a lot of the algorithms are O(n) instead of O(1)". Are there 
> any examples other than length()? 
>

Actually, it's worse than that... length, finding a particular character by 
character position, and getting a substring by character position (some of 
the most frequent operations for what I deal with) are O(n) instead of O(1), 
and things like conversions are O(n^2), not O(n) [and the conversions are 
much more complex, due to the string representation in Julia, unlike 
Python 3].
The conversions I am fixing, so that they are not O(n^2) but rather O(n) 
[slower than Python, again because of the representation, but not 
asymptotic].
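
To make the indexing point concrete, here is a minimal sketch (not the Base
implementation) of why a character-position lookup on a UTF-8 string is O(n):
you have to walk byte indices from the front with nextind.

    # Find the byte index of the k-th character by walking from the start.
    function nth_char_index(s::AbstractString, k::Integer)
        i = 1                     # byte index of the first character
        for _ = 1:k-1
            i = nextind(s, i)     # skip one (possibly multi-byte) character
        end
        return i
    end

    s = "αβγδε"
    s[nth_char_index(s, 3)]       # 'γ', reached after walking two characters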
The reason they are O(n^2), like the string concatenation problem I ran into 
right when I first started to evaluate Julia, is the way the conversion 
functions are written: they initially create a 0-length array, then do push! 
to successively add characters to the array, and finally call UTF8String, 
UTF16String, or UTF32String to convert the Vector{UInt8}, Vector{UInt16}, or 
Vector{Char} respectively into an immutable string.
As the output array grows, Julia's internals end up having to reallocate the 
memory and sometimes copy it to a new location, hence the O(n^2) nature of 
the code.

My changes, which hopefully will be accepted (after I check in my next round 
of pure Julia optimizations), solve that by validating the input UTF-8, 
UTF-16, or UTF-32 string while calculating how many characters of the 
different ranges are present, so that the memory can be allocated once, at 
exactly the size needed. That also frequently allows dispatching to simpler 
conversion code, when it is known that all of the characters in the string 
just need to be widened (zero-extended) or narrowed.
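
Schematically, the difference looks like this (a simplified sketch using
0.4-era spellings, not the actual Base code or my pull request; the final
wrapping into a string type is omitted):

    # The O(n^2)-prone pattern: grow the output one push! at a time.
    function widen_naive(bytes::Vector{UInt8})
        out = Char[]                 # length 0; capacity grows as we push
        for b in bytes
            push!(out, Char(b))      # may reallocate and copy the buffer
        end
        return out
    end

    # The count-first pattern: one pass to size it, one exact allocation,
    # one pass to fill, no reallocation.
    function widen_counted(bytes::Vector{UInt8})
        n = length(bytes)            # trivial here; real UTF-8 needs a
                                     # genuine counting/validation pass
        out = Array(Char, n)         # 0.4 spelling of the allocation
        @inbounds for i = 1:n
            out[i] = Char(bytes[i])
        end
        return out
    end

The counting pass costs a little, but it replaces log-many reallocations (and
the copying they imply) with a single allocation of the right size.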

> I disagree that UTF-8 has no space savings over UTF-32 when using the 
> full range of unicode. The reason is that strings often have only a 
> small percentage of non-BMP characters, with lots of spaces and 
> newlines etc. You don't want your whole file to use 4x the space just 
> to use one emoji. 
>

Please read my statement more carefully...

> UTF-8 *can* take up to 50% more storage than UTF-16 if you are just 
> dealing with BMP characters.
> If you have some field that needs to hold *a certain number of Unicode 
> characters*, for the full range of Unicode,
> you need to allocate 4 bytes for every character, so no savings compared 
> to UTF-16 or UTF-32.


My point was that if you have to allocate a buffer to hold a certain # of 
characters, say because you have a CHAR, NCHAR, WCHAR, or VARCHAR field from 
a DBMS, then for UTF-8 you need to allocate 4 bytes per character, so there 
are no savings over UTF-16 or UTF-32 for those operations...

I spent over two years going back and forth to Japan when I designed (and 
was the main implementor of) the Unicode support for a database system / 
language, and spent a lot of time looking at just how much storage space 
different representations would take... Note, at that time Unicode 2.0 was 
not out, so the choice was between UCS-2 (no surrogates then), UTF-8, some 
combination thereof, or some new encoding.

My first version, finally released in 1997, used either 8-bit (ANSI Latin 1) 
or UCS-2 to store data...  For the next release, I came up with a new 
encoding for Unicode that was much more compact (at the insistence of the 
Japanese customers, who didn't want their storage requirements to increase 
because of moving from SJIS and EUC to Unicode).
In memory, all strings were UCS-2 (or really UTF-16, 

Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-01 Thread Scott Jones
I just read through all of that very interesting thread on exceptions... it 
seems that Stefan was trying to reinvent the wheel, without knowing it.

Everybody interested in exception handling should go look up CLU... Julia 
seems to have gotten a lot of ideas from CLU (possibly rather indirectly,
through C++, Java, Lua...).
CLU had this well handled 40 years ago ;-)

Scott

On Friday, May 1, 2015 at 12:42:47 AM UTC-4, Harry B wrote:
>
> Sorry my comment wasn't well thought out and a bit off topic. On 
> exceptions/errors my issue is this 
> https://github.com/JuliaLang/julia/issues/7026
> On profiling, I was comparing to Go, but again off topic and I take my 
> comment back. I don't have any intelligent remarks to add (yet!) :)
> Thank you for the all the work you are doing. 
>
> On Thursday, April 30, 2015 at 7:00:01 PM UTC-7, Tim Holy wrote:
>>
>> Harry, I'm curious about 2 of your 3 last points: 
>>
>> On Thursday, April 30, 2015 05:50:15 PM Harry B wrote: 
>> > (exceptions?, debugging, profiling tools) 
>>
>> We have exceptions. What aspect are you referring to? 
>> Debugger: yes, that's missing, and it's a huge gap. 
>> Profiling tools: in my view we're doing OK (better than Matlab, in my 
>> opinion), 
>> but what do you see as missing? 
>>
>> --Tim 
>>
>> > 
>> > Thanks 
>> > -- 
>> > Harry 
>> > 
>> > On Thursday, April 30, 2015 at 3:43:36 PM UTC-7, Páll Haraldsson wrote: 
>> > > It seemed to me tuples where slow because of Any used. I understand 
>> tuples 
>> > > have been fixed, I'm not sure how. 
>> > > 
>> > > I do not remember the post/all the details. Yes, tuples where slow/er 
>> than 
>> > > Python. Maybe it was Dict, isn't that kind of a tuple? Now we have 
>> Pair in 
>> > > 0.4. I do not have 0.4, maybe I should bite the bullet and install.. 
>> I'm 
>> > > not doing anything production related and trying things out and using 
>> > > 0.3[.5] to avoid stability problems.. Then I can't judge the speed.. 
>> > > 
>> > > Another potential issue I saw with tuples (maybe that is not a 
>> problem in 
>> > > general, and I do not know that languages do this) is that they can 
>> take a 
>> > > lot of memory (to copy around). I was thinking, maybe they should do 
>> > > similar to databases, only use a fixed amount of memory (a "page") 
>> with a 
>> > > pointer to overflow data.. 
>> > > 
>> > > 2015-04-30 22:13 GMT+00:00 Ali Rezaee > >: 
>> > >> They were interesting questions. 
>> > >> I would also like to know why poorly written Julia code 
>> > >> sometimes performs worse than similar python code, especially when 
>> tuples 
>> > >> are involved. Did you say it was fixed? 
>> > >> 
>> > >> On Thursday, April 30, 2015 at 9:58:35 PM UTC+2, Páll Haraldsson 
>> wrote: 
>> > >>> Hi, 
>> > >>> 
>> > >>> [As a best language is subjective, I'll put that aside for a 
>> moment.] 
>> > >>> 
>> > >>> Part I. 
>> > >>> 
>> > >>> The goal, as I understand, for Julia is at least within a factor of 
>> two 
>> > >>> of C and already matching it mostly and long term beating that (and 
>> > >>> C++). 
>> > >>> [What other goals are there? How about 0.4 now or even 1.0..?] 
>> > >>> 
>> > >>> While that is the goal as a language, you can write slow code in 
>> any 
>> > >>> language and Julia makes that easier. :) [If I recall, Bezanson 
>> > >>> mentioned 
>> > >>> it (the global "problem") as a feature, any change there?] 
>> > >>> 
>> > >>> 
>> > >>> I've been following this forum for months and newbies hit the same 
>> > >>> issues. But almost always without fail, Julia can be speed up 
>> (easily as 
>> > >>> Tim Holy says). I'm thinking about the exceptions to that - are 
>> there 
>> > >>> any 
>> > >>> left? And about the "first code slowness" (see Part II). 
>> > >>> 
>> > >>> Just recently the last two flaws of Julia that I could see where 
>> fixed: 
>> > >>> Decimal floating point is in (I'll look into the 100x slowness, 
>> that is 
>> > >>> probably to be expected of any language, still I think may be a 
>> > >>> misunderstanding and/or I can do much better). And I understand the 
>> > >>> tuple 
>> > >>> slowness has been fixed (that was really the only "core language" 
>> > >>> defect). 
>> > >>> The former wasn't a performance problem (mostly a non existence 
>> problem 
>> > >>> and 
>> > >>> correctness one (where needed)..). 
>> > >>> 
>> > >>> 
>> > >>> Still we see threads like this one recent one: 
>> > >>> 
>> > >>> https://groups.google.com/forum/#!topic/julia-users/-bx9xIfsHHw 
>> > >>> "It seems changing the order of nested loops also helps" 
>> > >>> 
>> > >>> Obviously Julia can't beat assembly but really C/Fortran is already 
>> > >>> close enough (within a small factor). The above row vs. column 
>> major 
>> > >>> (caching effects in general) can kill performance in all languages. 
>> > >>> Putting 
>> > >>> that newbie mistake aside, is there any reason Julia can be within 
>> a 
>> > >>> small 
>> > >>> factor of assembly (or C) in all cases already? 
>> > >>> 
>> > >>> 
>

Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-01 Thread Tim Holy
Don't apologize; instead, tell us more about what Go does, and how you think 
things can be better. Those of us who don't know Go will thank you for it.

Best,
--Tim

On Thursday, April 30, 2015 09:42:47 PM Harry B wrote:
> Sorry my comment wasn't well thought out and a bit off topic. On
> exceptions/errors my issue is this
> https://github.com/JuliaLang/julia/issues/7026
> On profiling, I was comparing to Go, but again off topic and I take my
> comment back. I don't have any intelligent remarks to add (yet!) :)
> Thank you for the all the work you are doing.
> 
> On Thursday, April 30, 2015 at 7:00:01 PM UTC-7, Tim Holy wrote:
> > Harry, I'm curious about 2 of your 3 last points:
> > 
> > On Thursday, April 30, 2015 05:50:15 PM Harry B wrote:
> > > (exceptions?, debugging, profiling tools)
> > 
> > We have exceptions. What aspect are you referring to?
> > Debugger: yes, that's missing, and it's a huge gap.
> > Profiling tools: in my view we're doing OK (better than Matlab, in my
> > opinion),
> > but what do you see as missing?
> > 
> > --Tim
> > 
> > > Thanks
> > > 
> > > > It seemed to me tuples where slow because of Any used. I understand
> > 
> > tuples
> > 
> > > > have been fixed, I'm not sure how.
> > > > 
> > > > I do not remember the post/all the details. Yes, tuples where slow/er
> > 
> > than
> > 
> > > > Python. Maybe it was Dict, isn't that kind of a tuple? Now we have
> > 
> > Pair in
> > 
> > > > 0.4. I do not have 0.4, maybe I should bite the bullet and install..
> > 
> > I'm
> > 
> > > > not doing anything production related and trying things out and using
> > > > 0.3[.5] to avoid stability problems.. Then I can't judge the speed..
> > > > 
> > > > Another potential issue I saw with tuples (maybe that is not a problem
> > 
> > in
> > 
> > > > general, and I do not know that languages do this) is that they can
> > 
> > take a
> > 
> > > > lot of memory (to copy around). I was thinking, maybe they should do
> > > > similar to databases, only use a fixed amount of memory (a "page")
> > 
> > with a
> > 
> > > > pointer to overflow data..
> > > > 
> > > > 2015-04-30 22:13 GMT+00:00 Ali Rezaee  > 
> > >:
> > > >> They were interesting questions.
> > > >> I would also like to know why poorly written Julia code
> > > >> sometimes performs worse than similar python code, especially when
> > 
> > tuples
> > 
> > > >> are involved. Did you say it was fixed?
> > > >> 
> > > >> On Thursday, April 30, 2015 at 9:58:35 PM UTC+2, Páll Haraldsson
> > 
> > wrote:
> > > >>> Hi,
> > > >>> 
> > > >>> [As a best language is subjective, I'll put that aside for a
> > 
> > moment.]
> > 
> > > >>> Part I.
> > > >>> 
> > > >>> The goal, as I understand, for Julia is at least within a factor of
> > 
> > two
> > 
> > > >>> of C and already matching it mostly and long term beating that (and
> > > >>> C++).
> > > >>> [What other goals are there? How about 0.4 now or even 1.0..?]
> > > >>> 
> > > >>> While that is the goal as a language, you can write slow code in any
> > > >>> language and Julia makes that easier. :) [If I recall, Bezanson
> > > >>> mentioned
> > > >>> it (the global "problem") as a feature, any change there?]
> > > >>> 
> > > >>> 
> > > >>> I've been following this forum for months and newbies hit the same
> > > >>> issues. But almost always without fail, Julia can be speed up
> > 
> > (easily as
> > 
> > > >>> Tim Holy says). I'm thinking about the exceptions to that - are
> > 
> > there
> > 
> > > >>> any
> > > >>> left? And about the "first code slowness" (see Part II).
> > > >>> 
> > > >>> Just recently the last two flaws of Julia that I could see where
> > 
> > fixed:
> > > >>> Decimal floating point is in (I'll look into the 100x slowness, that
> > 
> > is
> > 
> > > >>> probably to be expected of any language, still I think may be a
> > > >>> misunderstanding and/or I can do much better). And I understand the
> > > >>> tuple
> > > >>> slowness has been fixed (that was really the only "core language"
> > > >>> defect).
> > > >>> The former wasn't a performance problem (mostly a non existence
> > 
> > problem
> > 
> > > >>> and
> > > >>> correctness one (where needed)..).
> > > >>> 
> > > >>> 
> > > >>> Still we see threads like this one recent one:
> > > >>> 
> > > >>> https://groups.google.com/forum/#!topic/julia-users/-bx9xIfsHHw
> > > >>> "It seems changing the order of nested loops also helps"
> > > >>> 
> > > >>> Obviously Julia can't beat assembly but really C/Fortran is already
> > > >>> close enough (within a small factor). The above row vs. column major
> > > >>> (caching effects in general) can kill performance in all languages.
> > > >>> Putting
> > > >>> that newbie mistake aside, is there any reason Julia can be within a
> > > >>> small
> > > >>> factor of assembly (or C) in all cases already?
> > > >>> 
> > > >>> 
> > > >>> Part II.
> > > >>> 
> > > >>> Except for caching issues, I still want the most newbie code or
> > > >>> intentionally brain-damaged code to run faster than at le

Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-05-01 Thread Steven Sagaert
I think the performance comparisons between Julia & Python are flawed. They 
seem to be between standard Python & Julia but since Julia is all about 
scientific programming it really should be between SciPy & Julia. Since 
SciPy uses much of the same underlying libs in Fortran/C the performance 
gap will be much smaller and to be really fair it should be between Numba 
compiled SciPy code & julia. I suspect the performance will be very close 
then (and close to C performance).

Similarly the standard benchmark (on the opening page of julia website) 
between R & julia is also flawed because it takes the best case scenario 
for julia (loops & mutable datastructures) & the worst case scenario for R. 
When the same R program is rewritten in vectorised style it beat julia 
see 
https://matloff.wordpress.com/2014/05/21/r-beats-python-r-beats-julia-anyone-else-wanna-challenge-r/.

So my interest in julia isn't because it is the fastest scientific high 
level language (because clearly at this stage you can't really claim that) 
but because it's a clean interesting language (still needs work for some 
rough edges of course) with clean(er) & clear(er) libraries  and that gives 
reasonable performance out of the box without much tweaking. 

On Friday, May 1, 2015 at 12:10:58 AM UTC+2, Scott Jones wrote:
>
> Yes... Python will win on string processing... esp. with Python 3... I 
> quickly ran into things that were > 800x faster in Python...
> (I hope to help change that though!)
>
> Scott
>
> On Thursday, April 30, 2015 at 6:01:45 PM UTC-4, Páll Haraldsson wrote:
>>
>> I wouldn't expect a difference in Julia for code like that (didn't 
>> check). But I guess what we are often seeing is someone comparing a tuned 
>> Python code to newbie Julia code. I still want it faster than that code.. 
>> (assuming same algorithm, note row vs. column major caveat).
>>
>> The main point of mine, *should* Python at any time win?
>>
>> 2015-04-30 21:36 GMT+00:00 Sisyphuss :
>>
>>> This post interests me. I'll write something here to follow this post.
>>>
>>> The performance gap between normal code in Python and badly-written code 
>>> in Julia is something I'd like to know too.
>>> As far as I know, the Python interpreter does some mysterious optimizations. 
>>> For example `(x**2)**2` is 100x faster than `x**4`.
>>>
>>>
>>>
>>>
>>> On Thursday, April 30, 2015 at 9:58:35 PM UTC+2, Páll Haraldsson wrote:


 Hi,

 [As a best language is subjective, I'll put that aside for a moment.]

 Part I.

 The goal, as I understand, for Julia is at least within a factor of two 
 of C and already matching it mostly and long term beating that (and C++). 
 [What other goals are there? How about 0.4 now or even 1.0..?]

 While that is the goal as a language, you can write slow code in any 
 language and Julia makes that easier. :) [If I recall, Bezanson mentioned 
 it (the global "problem") as a feature, any change there?]


 I've been following this forum for months and newbies hit the same 
 issues. But almost always without fail, Julia can be speed up (easily as 
 Tim Holy says). I'm thinking about the exceptions to that - are there any 
 left? And about the "first code slowness" (see Part II).

 Just recently the last two flaws of Julia that I could see where fixed: 
 Decimal floating point is in (I'll look into the 100x slowness, that is 
 probably to be expected of any language, still I think may be a 
 misunderstanding and/or I can do much better). And I understand the tuple 
 slowness has been fixed (that was really the only "core language" defect). 
 The former wasn't a performance problem (mostly a non existence problem 
 and 
 correctness one (where needed)..).


 Still we see threads like this one recent one:

 https://groups.google.com/forum/#!topic/julia-users/-bx9xIfsHHw
 "It seems changing the order of nested loops also helps"

 Obviously Julia can't beat assembly but really C/Fortran is already 
 close enough (within a small factor). The above row vs. column major 
 (caching effects in general) can kill performance in all languages. 
 Putting 
 that newbie mistake aside, is there any reason Julia can be within a small 
 factor of assembly (or C) in all cases already?


 Part II.

 Except for caching issues, I still want the most newbie code or 
 intentionally brain-damaged code to run faster than at least 
 Python/scripting/interpreted languages.

 Potential problems (that I think are solved or at least not problems in 
 theory):

 1. I know Any kills performance. Still, isn't that the default in 
 Python (and Ruby, Perl?)? Is there a good reason Julia can't be faster 
 than 
 at least all the so-called scripting languages in all cases (excluding 
 small startup overhead, see below)?

 2. The global is

Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-04-30 Thread Jeff Bezanson
It is true that we have not yet done enough to optimize the worst and
worse performance cases. The bright side of that is that we have room
to improve; it's not that we've run out of ideas and techniques.

Tim is right that the complexity of our dispatch system makes julia
potentially slower than python. But in dispatch-heavy code I've seen
cases where we are faster or slower; it depends.

Python's string and dictionary operations, in particular, are really
fast. This is not surprising considering what the language was
designed for, and that they have a big library of well-tuned C code
for these things.

I still maintain that it is misleading to describe an *asymptotic*
slowdown as "800x slower". If you name a constant factor, it sounds
like you're talking about a constant factor slowdown. But the number
is arbitrary, because it depends on data size. In theory, of course,
an asymptotic slowdown is *much worse* than a constant factor
slowdown. However in the systems world constant factors are often more
important, and are often what we talk about.

You say "a lot of the algorithms are O(n) instead of O(1)". Are there
any examples other than length()?

I disagree that UTF-8 has no space savings over UTF-32 when using the
full range of unicode. The reason is that strings often have only a
small percentage of non-BMP characters, with lots of spaces and
newlines etc. You don't want your whole file to use 4x the space just
to use one emoji.


On Fri, May 1, 2015 at 12:42 AM, Harry B  wrote:
> Sorry my comment wasn't well thought out and a bit off topic. On
> exceptions/errors my issue is this
> https://github.com/JuliaLang/julia/issues/7026
> On profiling, I was comparing to Go, but again off topic and I take my
> comment back. I don't have any intelligent remarks to add (yet!) :)
> Thank you for the all the work you are doing.
>
> On Thursday, April 30, 2015 at 7:00:01 PM UTC-7, Tim Holy wrote:
>>
>> Harry, I'm curious about 2 of your 3 last points:
>>
>> On Thursday, April 30, 2015 05:50:15 PM Harry B wrote:
>> > (exceptions?, debugging, profiling tools)
>>
>> We have exceptions. What aspect are you referring to?
>> Debugger: yes, that's missing, and it's a huge gap.
>> Profiling tools: in my view we're doing OK (better than Matlab, in my
>> opinion),
>> but what do you see as missing?
>>
>> --Tim
>>
>> >
>> > Thanks
>> > --
>> > Harry
>> >
>> > On Thursday, April 30, 2015 at 3:43:36 PM UTC-7, Páll Haraldsson wrote:
>> > > It seemed to me tuples where slow because of Any used. I understand
>> > > tuples
>> > > have been fixed, I'm not sure how.
>> > >
>> > > I do not remember the post/all the details. Yes, tuples where slow/er
>> > > than
>> > > Python. Maybe it was Dict, isn't that kind of a tuple? Now we have
>> > > Pair in
>> > > 0.4. I do not have 0.4, maybe I should bite the bullet and install..
>> > > I'm
>> > > not doing anything production related and trying things out and using
>> > > 0.3[.5] to avoid stability problems.. Then I can't judge the speed..
>> > >
>> > > Another potential issue I saw with tuples (maybe that is not a problem
>> > > in
>> > > general, and I do not know that languages do this) is that they can
>> > > take a
>> > > lot of memory (to copy around). I was thinking, maybe they should do
>> > > similar to databases, only use a fixed amount of memory (a "page")
>> > > with a
>> > > pointer to overflow data..
>> > >
>> > > 2015-04-30 22:13 GMT+00:00 Ali Rezaee > > > >:
>> > >> They were interesting questions.
>> > >> I would also like to know why poorly written Julia code
>> > >> sometimes performs worse than similar python code, especially when
>> > >> tuples
>> > >> are involved. Did you say it was fixed?
>> > >>
>> > >> On Thursday, April 30, 2015 at 9:58:35 PM UTC+2, Páll Haraldsson
>> > >> wrote:
>> > >>> Hi,
>> > >>>
>> > >>> [As a best language is subjective, I'll put that aside for a
>> > >>> moment.]
>> > >>>
>> > >>> Part I.
>> > >>>
>> > >>> The goal, as I understand, for Julia is at least within a factor of
>> > >>> two
>> > >>> of C and already matching it mostly and long term beating that (and
>> > >>> C++).
>> > >>> [What other goals are there? How about 0.4 now or even 1.0..?]
>> > >>>
>> > >>> While that is the goal as a language, you can write slow code in any
>> > >>> language and Julia makes that easier. :) [If I recall, Bezanson
>> > >>> mentioned
>> > >>> it (the global "problem") as a feature, any change there?]
>> > >>>
>> > >>>
>> > >>> I've been following this forum for months and newbies hit the same
>> > >>> issues. But almost always without fail, Julia can be speed up
>> > >>> (easily as
>> > >>> Tim Holy says). I'm thinking about the exceptions to that - are
>> > >>> there
>> > >>> any
>> > >>> left? And about the "first code slowness" (see Part II).
>> > >>>
>> > >>> Just recently the last two flaws of Julia that I could see where
>> > >>> fixed:
>> > >>> Decimal floating point is in (I'll look into the 100x slowness, that
>> > >>> is
>> > >>> probably 

Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-04-30 Thread Harry B
Sorry my comment wasn't well thought out and a bit off topic. On 
exceptions/errors my issue is this 
https://github.com/JuliaLang/julia/issues/7026
On profiling, I was comparing to Go, but again off topic and I take my 
comment back. I don't have any intelligent remarks to add (yet!) :)
Thank you for all the work you are doing. 

On Thursday, April 30, 2015 at 7:00:01 PM UTC-7, Tim Holy wrote:
>
> Harry, I'm curious about 2 of your 3 last points: 
>
> On Thursday, April 30, 2015 05:50:15 PM Harry B wrote: 
> > (exceptions?, debugging, profiling tools) 
>
> We have exceptions. What aspect are you referring to? 
> Debugger: yes, that's missing, and it's a huge gap. 
> Profiling tools: in my view we're doing OK (better than Matlab, in my 
> opinion), 
> but what do you see as missing? 
>
> --Tim 
>
> > 
> > Thanks 
> > -- 
> > Harry 
> > 
> > On Thursday, April 30, 2015 at 3:43:36 PM UTC-7, Páll Haraldsson wrote: 
> > > It seemed to me tuples where slow because of Any used. I understand 
> tuples 
> > > have been fixed, I'm not sure how. 
> > > 
> > > I do not remember the post/all the details. Yes, tuples where slow/er 
> than 
> > > Python. Maybe it was Dict, isn't that kind of a tuple? Now we have 
> Pair in 
> > > 0.4. I do not have 0.4, maybe I should bite the bullet and install.. 
> I'm 
> > > not doing anything production related and trying things out and using 
> > > 0.3[.5] to avoid stability problems.. Then I can't judge the speed.. 
> > > 
> > > Another potential issue I saw with tuples (maybe that is not a problem 
> in 
> > > general, and I do not know that languages do this) is that they can 
> take a 
> > > lot of memory (to copy around). I was thinking, maybe they should do 
> > > similar to databases, only use a fixed amount of memory (a "page") 
> with a 
> > > pointer to overflow data.. 
> > > 
> > > 2015-04-30 22:13 GMT+00:00 Ali Rezaee  >: 
> > >> They were interesting questions. 
> > >> I would also like to know why poorly written Julia code 
> > >> sometimes performs worse than similar python code, especially when 
> tuples 
> > >> are involved. Did you say it was fixed? 
> > >> 
> > >> On Thursday, April 30, 2015 at 9:58:35 PM UTC+2, Páll Haraldsson 
> wrote: 
> > >>> Hi, 
> > >>> 
> > >>> [As a best language is subjective, I'll put that aside for a 
> moment.] 
> > >>> 
> > >>> Part I. 
> > >>> 
> > >>> The goal, as I understand, for Julia is at least within a factor of 
> two 
> > >>> of C and already matching it mostly and long term beating that (and 
> > >>> C++). 
> > >>> [What other goals are there? How about 0.4 now or even 1.0..?] 
> > >>> 
> > >>> While that is the goal as a language, you can write slow code in any 
> > >>> language and Julia makes that easier. :) [If I recall, Bezanson 
> > >>> mentioned 
> > >>> it (the global "problem") as a feature, any change there?] 
> > >>> 
> > >>> 
> > >>> I've been following this forum for months and newbies hit the same 
> > >>> issues. But almost always without fail, Julia can be speed up 
> (easily as 
> > >>> Tim Holy says). I'm thinking about the exceptions to that - are 
> there 
> > >>> any 
> > >>> left? And about the "first code slowness" (see Part II). 
> > >>> 
> > >>> Just recently the last two flaws of Julia that I could see where 
> fixed: 
> > >>> Decimal floating point is in (I'll look into the 100x slowness, that 
> is 
> > >>> probably to be expected of any language, still I think may be a 
> > >>> misunderstanding and/or I can do much better). And I understand the 
> > >>> tuple 
> > >>> slowness has been fixed (that was really the only "core language" 
> > >>> defect). 
> > >>> The former wasn't a performance problem (mostly a non existence 
> problem 
> > >>> and 
> > >>> correctness one (where needed)..). 
> > >>> 
> > >>> 
> > >>> Still we see threads like this one recent one: 
> > >>> 
> > >>> https://groups.google.com/forum/#!topic/julia-users/-bx9xIfsHHw 
> > >>> "It seems changing the order of nested loops also helps" 
> > >>> 
> > >>> Obviously Julia can't beat assembly but really C/Fortran is already 
> > >>> close enough (within a small factor). The above row vs. column major 
> > >>> (caching effects in general) can kill performance in all languages. 
> > >>> Putting 
> > >>> that newbie mistake aside, is there any reason Julia can be within a 
> > >>> small 
> > >>> factor of assembly (or C) in all cases already? 
> > >>> 
> > >>> 
> > >>> Part II. 
> > >>> 
> > >>> Except for caching issues, I still want the most newbie code or 
> > >>> intentionally brain-damaged code to run faster than at least 
> > >>> Python/scripting/interpreted languages. 
> > >>> 
> > >>> Potential problems (that I think are solved or at least not problems 
> in 
> > >>> theory): 
> > >>> 
> > >>> 1. I know Any kills performance. Still, isn't that the default in 
> Python 
> > >>> (and Ruby, Perl?)? Is there a good reason Julia can't be faster than 
> at 
> > >>> least all the so-called scripting languages in all cases (ex

Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-04-30 Thread Scott Jones

> On Apr 30, 2015, at 9:58 PM, Tim Holy  wrote:
> 
> Strings have long been a performance sore-spot in julia, so we're glad Scott 
> is hammering on that topic.

Thanks, Tim!  I was beginning to think I’d be banned from all Julia forums, for 
being a thorn in the side of
the Julia developers…
(I do want to say again… if I didn’t think what all of you had created was 
incredibly great, I wouldn’t be so interested in making it even greater, in 
the particular areas I know a little about…
Also, the issues I’ve found are not because the developers aren’t brilliant 
[I’ve been super impressed, and I don’t impress
that easily!], but rather, either it’s outside of their area of expertise [as 
the numerical computing stuff is outside mine], or they
are incredibly busy making great strides in the areas that they are more 
interested in…)

> For "interpreted" code (including Julia with Any types), it's very possible 
> that Python is and will remain faster. For one thing, Python is single-
> dispatch, which means that when the interpreter has to go look up the 
> function 
> corresponding to your next expression, typically the list of applicable 
> methods is quite short. In contrast, julia sometimes has to sort through huge 
> method tables to determine the appropriate one to dispatch to. Multiple 
> dispatch adds a lot of power to the language, and there's no performance cost 
> for code that has been compiled, but it does make interpreted code slower.

Good point…

Scott

Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-04-30 Thread Tim Holy
Harry, I'm curious about 2 of your 3 last points:

On Thursday, April 30, 2015 05:50:15 PM Harry B wrote:
> (exceptions?, debugging, profiling tools)

We have exceptions. What aspect are you referring to?
Debugger: yes, that's missing, and it's a huge gap.
Profiling tools: in my view we're doing OK (better than Matlab, in my opinion), 
but what do you see as missing? 

--Tim

> 
> Thanks
> --
> Harry
> 
> On Thursday, April 30, 2015 at 3:43:36 PM UTC-7, Páll Haraldsson wrote:
> > It seemed to me tuples where slow because of Any used. I understand tuples
> > have been fixed, I'm not sure how.
> > 
> > I do not remember the post/all the details. Yes, tuples where slow/er than
> > Python. Maybe it was Dict, isn't that kind of a tuple? Now we have Pair in
> > 0.4. I do not have 0.4, maybe I should bite the bullet and install.. I'm
> > not doing anything production related and trying things out and using
> > 0.3[.5] to avoid stability problems.. Then I can't judge the speed..
> > 
> > Another potential issue I saw with tuples (maybe that is not a problem in
> > general, and I do not know that languages do this) is that they can take a
> > lot of memory (to copy around). I was thinking, maybe they should do
> > similar to databases, only use a fixed amount of memory (a "page") with a
> > pointer to overflow data..
> > 
> > 2015-04-30 22:13 GMT+00:00 Ali Rezaee >:
> >> They were interesting questions.
> >> I would also like to know why poorly written Julia code
> >> sometimes performs worse than similar python code, especially when tuples
> >> are involved. Did you say it was fixed?
> >> 
> >> On Thursday, April 30, 2015 at 9:58:35 PM UTC+2, Páll Haraldsson wrote:
> >>> Hi,
> >>> 
> >>> [As a best language is subjective, I'll put that aside for a moment.]
> >>> 
> >>> Part I.
> >>> 
> >>> The goal, as I understand, for Julia is at least within a factor of two
> >>> of C and already matching it mostly and long term beating that (and
> >>> C++).
> >>> [What other goals are there? How about 0.4 now or even 1.0..?]
> >>> 
> >>> While that is the goal as a language, you can write slow code in any
> >>> language and Julia makes that easier. :) [If I recall, Bezanson
> >>> mentioned
> >>> it (the global "problem") as a feature, any change there?]
> >>> 
> >>> 
> >>> I've been following this forum for months and newbies hit the same
> >>> issues. But almost always without fail, Julia can be speed up (easily as
> >>> Tim Holy says). I'm thinking about the exceptions to that - are there
> >>> any
> >>> left? And about the "first code slowness" (see Part II).
> >>> 
> >>> Just recently the last two flaws of Julia that I could see where fixed:
> >>> Decimal floating point is in (I'll look into the 100x slowness, that is
> >>> probably to be expected of any language, still I think may be a
> >>> misunderstanding and/or I can do much better). And I understand the
> >>> tuple
> >>> slowness has been fixed (that was really the only "core language"
> >>> defect).
> >>> The former wasn't a performance problem (mostly a non existence problem
> >>> and
> >>> correctness one (where needed)..).
> >>> 
> >>> 
> >>> Still we see threads like this one recent one:
> >>> 
> >>> https://groups.google.com/forum/#!topic/julia-users/-bx9xIfsHHw
> >>> "It seems changing the order of nested loops also helps"
> >>> 
> >>> Obviously Julia can't beat assembly but really C/Fortran is already
> >>> close enough (within a small factor). The above row vs. column major
> >>> (caching effects in general) can kill performance in all languages.
> >>> Putting
> >>> that newbie mistake aside, is there any reason Julia can be within a
> >>> small
> >>> factor of assembly (or C) in all cases already?
> >>> 
> >>> 
> >>> Part II.
> >>> 
> >>> Except for caching issues, I still want the most newbie code or
> >>> intentionally brain-damaged code to run faster than at least
> >>> Python/scripting/interpreted languages.
> >>> 
> >>> Potential problems (that I think are solved or at least not problems in
> >>> theory):
> >>> 
> >>> 1. I know Any kills performance. Still, isn't that the default in Python
> >>> (and Ruby, Perl?)? Is there a good reason Julia can't be faster than at
> >>> least all the so-called scripting languages in all cases (excluding
> >>> small
> >>> startup overhead, see below)?
> >>> 
> >>> 2. The global issue, not sure if that slows other languages down, say
> >>> Python. Even if it doesn't, should Julia be slower than Python because
> >>> of
> >>> global?
> >>> 
> >>> 3. Garbage collection. I do not see that as a problem, incorrect? Mostly
> >>> performance variability ("[3D] games" - subject for another post, as I'm
> >>> not sure that is even a problem in theory..). Should reference counting
> >>> (Python) be faster? On the contrary, I think RC and even manual memory
> >>> management could be slower.
> >>> 
> >>> 4. Concurrency, see nr. 3. GC may or may not have an issue with it. It
> >>> can be a problem, what about in Julia? There are con

Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-04-30 Thread Tim Holy
Strings have long been a performance sore-spot in julia, so we're glad Scott 
is hammering on that topic.

For "interpreted" code (including Julia with Any types), it's very possible 
that Python is and will remain faster. For one thing, Python is single-
dispatch, which means that when the interpreter has to go look up the function 
corresponding to your next expression, typically the list of applicable 
methods is quite short. In contrast, julia sometimes has to sort through huge 
method tables to determine the appropriate one to dispatch to. Multiple 
dispatch adds a lot of power to the language, and there's no performance cost 
for code that has been compiled, but it does make interpreted code slower.
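
As a quick illustration of the method-table point, try this at the REPL (the
call works in any recent version):

    methods(+)    # prints the (very long) list of methods defined for +;
                  # the count at the top is what dispatch has to narrow down
                  # when argument types are only known at run time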

Best,
--Tim

On Thursday, April 30, 2015 10:34:20 PM Páll Haraldsson wrote:
> Interesting.. does that mean Unicode then that is esp. faster or something
> else?
> 
> >800x faster is way worse than I thought and no good reason for it..
> 
> I'm really intrigued what is this slow, can't be the simple things like say
> just string concatenation?!
> 
> You can get similar speed using PyCall.jl :)
> 
> For some obscure function like Levenshtein distance I might expect this (or
> not implemented yet in Julia) as Python would use tuned C code or in any
> function where you need to do non-trivial work per function-call.
> 
> 
> I failed to add regex to the list as an example as I was pretty sure it was
> as fast (or faster, because of macros) as Perl as it is using the same
> library.
> 
> Similarly for all Unicode/UTF-8 stuff I was not expecting slowness. I know
> the work on that in Python2/3 and expected Julia could/did similar.
> 
> 2015-04-30 22:10 GMT+00:00 Scott Jones :
> > Yes... Python will win on string processing... esp. with Python 3... I
> > quickly ran into things that were > 800x faster in Python...
> > (I hope to help change that though!)
> > 
> > Scott
> > 
> > On Thursday, April 30, 2015 at 6:01:45 PM UTC-4, Páll Haraldsson wrote:
> >> I wouldn't expect a difference in Julia for code like that (didn't
> >> check). But I guess what we are often seeing is someone comparing a tuned
> >> Python code to newbie Julia code. I still want it faster than that code..
> >> (assuming same algorithm, note row vs. column major caveat).
> >> 
> >> The main point of mine, *should* Python at any time win?
> >> 
> >> 2015-04-30 21:36 GMT+00:00 Sisyphuss :
> >>> This post interests me. I'll write something here to follow this post.
> >>> 
> >>> The performance gap between normal code in Python and badly-written code
> >>> in Julia is something I'd like to know too.
> >>> As far as I know, the Python interpreter does some mysterious optimizations.
> >>> For example `(x**2)**2` is 100x faster than `x**4`.
> >>> 
> >>> On Thursday, April 30, 2015 at 9:58:35 PM UTC+2, Páll Haraldsson wrote:
>  Hi,
>  
>  [As a best language is subjective, I'll put that aside for a moment.]
>  
>  Part I.
>  
>  The goal, as I understand, for Julia is at least within a factor of two
>  of C and already matching it mostly and long term beating that (and
>  C++).
>  [What other goals are there? How about 0.4 now or even 1.0..?]
>  
>  While that is the goal as a language, you can write slow code in any
>  language and Julia makes that easier. :) [If I recall, Bezanson
>  mentioned
>  it (the global "problem") as a feature, any change there?]
>  
>  
>  I've been following this forum for months and newbies hit the same
>  issues. But almost always without fail, Julia can be speed up (easily
>  as
>  Tim Holy says). I'm thinking about the exceptions to that - are there
>  any
>  left? And about the "first code slowness" (see Part II).
>  
>  Just recently the last two flaws of Julia that I could see where fixed:
>  Decimal floating point is in (I'll look into the 100x slowness, that is
>  probably to be expected of any language, still I think may be a
>  misunderstanding and/or I can do much better). And I understand the
>  tuple
>  slowness has been fixed (that was really the only "core language"
>  defect).
>  The former wasn't a performance problem (mostly a non existence problem
>  and
>  correctness one (where needed)..).
>  
>  
>  Still we see threads like this one recent one:
>  
>  https://groups.google.com/forum/#!topic/julia-users/-bx9xIfsHHw
>  "It seems changing the order of nested loops also helps"
>  
>  Obviously Julia can't beat assembly but really C/Fortran is already
>  close enough (within a small factor). The above row vs. column major
>  (caching effects in general) can kill performance in all languages.
>  Putting
>  that newbie mistake aside, is there any reason Julia can be within a
>  small
>  factor of assembly (or C) in all cases already?
>  

>  
>  Part II.
>  
> >>>

Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-04-30 Thread Scott Jones


On Thursday, April 30, 2015 at 6:34:23 PM UTC-4, Páll Haraldsson wrote:
>
> Interesting.. does that mean Unicode then that is esp. faster or something 
> else?
>
> >800x faster is way worse than I thought and no good reason for it..
>

That particular case is because CPython (the standard C implementation of 
Python, what most people mean when they say Python) has optimized the case of

var += string

which is appending to a variable.

Although strings *are* immutable in Python, as in Julia, Python detects that 
you are replacing a string with the string concatenated with another; if 
nobody else has a reference to the string in that variable, it can simply 
update the string in place, and otherwise it makes a new string big enough 
for the result and sets the variable to that new string.
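
For comparison, the usual way to sidestep the quadratic pattern on the Julia
side is to append into one buffer and materialize the string once at the end
(0.4-era API below; later versions spell the last step String(take!(io))):

    parts = fill("abc", 10000)        # stand-in data

    # Repeated concatenation: every *= builds a brand-new string.
    s = ""
    for p in parts
        s *= p
    end

    # Buffered: append into one growing buffer, convert once at the end.
    io = IOBuffer()
    for p in parts
        write(io, p)
    end
    t = takebuf_string(io)

    s == t                            # true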
 

> I'm really intrigued what is this slow, can't be the simple things like 
> say just string concatenation?!
>
> You can get similar speed using PyCall.jl :)
>

I'm not so sure... I don't really think so - because you still have to move 
the string from Julia (which uses either ASCII or UTF-8 for strings by 
default; you have to specifically convert them to get UTF-16 or UTF-32...) to 
Python, and then back... and Julia's string conversions are rather slow... 
O(n^2) in most cases...
(I'm working on improving that; I hope I can get my changes accepted into 
Julia's Base)

For some obscure function like Levenshtein distance I might expect this (or 
> not implemented yet in Julia) as Python would use tuned C code or in any 
> function where you need to do non-trivial work per function-call.
>
>
> I failed to add regex to the list as an example as I was pretty sure it 
> was as fast (or faster, because of macros) as Perl as it is using the same 
> library.
>
> Similarly for all Unicode/UTF-8 stuff I was not expecting slowness. I know 
> the work on that in Python2/3 and expected Julia could/did similar.
>

No, a lot of the algorithms are O(n) instead of O(1), because of the 
decision to use UTF-8...
I'd like to convince the core team to change Julia to do what Python 3 does.
UTF-8 is pretty bad to use for internal string representation (where it 
shines is as an interchange format).
UTF-8 can take up to 50% more storage than UTF-16 if you are just dealing 
with BMP characters.
If you have some field that needs to hold a certain number of Unicode 
characters, for the full range of Unicode,
you need to allocate 4 bytes for every character, so no savings compared to 
UTF-16 or UTF-32.
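
A concrete example of that 50% figure (sizeof gives the UTF-8 byte count; the
UTF-16 number is just 2 bytes per BMP character):

    s = "日本語のテキスト"       # 8 characters, all in the BMP
    length(s)                    # 8 characters
    sizeof(s)                    # 24 bytes as UTF-8 (3 bytes each),
                                 # vs. 16 bytes as UTF-16 (2 bytes each)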

Python 3 internally stores strings as either: 7-bit (ASCII), 8-bit (ANSI 
Latin1, only characters < 0x100 present), 16-bit (UCS-2, i.e. there are no 
non-BMP characters present),
or 32-bit (UTF-32).  You might wonder why there is a special distinction 
between 7-bit ASCII and 8-bit ANSI Latin 1... they are both Unicode 
subsets, but 7-bit ASCII
can also be used directly without conversion as UTF-8.
All internal formats are directly addressable (unlike Julia's UTF8String 
and UTF16String), and the conversions between the 4 internal types are very 
fast: simple widening (or a no-op, as in the case of ASCII -> ANSI) when 
going from smaller to larger.

Julia also has a big problem with always wanting to have a terminating \0 
byte or word, which means that you can't take a substring or slice of 
another string without
making a copy to be able to add that terminating \0 (so lots of extra 
memory allocation and garbage collection for common algorithms).

I hope that makes things a bit clearer!

Scott


Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-04-30 Thread Harry B
A newbie comment: if it can be made a bit easier to write code that uses 
all the cores (I am comparing to Go with its channels), it probably doesn't 
need to be faster than Python. 

From an outsider's perspective, @everywhere is inconvenient. pmap etc. 
doesn't cover nearly as many cases as Go channels. Maybe it is a 
documentation problem.
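
For reference, this is the pattern I'm comparing against on the Julia side
(0.4-era API; later versions also need using Distributed first):

    addprocs(3)                          # start 3 worker processes
    @everywhere slow_double(x) = (sleep(0.1); 2x)
    pmap(slow_double, 1:12)              # farm the 12 calls out to the workers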

I wouldn't think it would be good to try to extract every last bit of speed 
when you are at 0.4.. there are so many things to clean up/build in the 
language and standard library (exceptions?, debugging, profiling tools)

Thanks
--
Harry

On Thursday, April 30, 2015 at 3:43:36 PM UTC-7, Páll Haraldsson wrote:
>
> It seemed to me tuples where slow because of Any used. I understand tuples 
> have been fixed, I'm not sure how.
>
> I do not remember the post/all the details. Yes, tuples where slow/er than 
> Python. Maybe it was Dict, isn't that kind of a tuple? Now we have Pair in 
> 0.4. I do not have 0.4, maybe I should bite the bullet and install.. I'm 
> not doing anything production related and trying things out and using 
> 0.3[.5] to avoid stability problems.. Then I can't judge the speed..
>
> Another potential issue I saw with tuples (maybe that is not a problem in 
> general, and I do not know that languages do this) is that they can take a 
> lot of memory (to copy around). I was thinking, maybe they should do 
> similar to databases, only use a fixed amount of memory (a "page") with a 
> pointer to overflow data..
>
> 2015-04-30 22:13 GMT+00:00 Ali Rezaee :
>
>> They were interesting questions.
>> I would also like to know why poorly written Julia code 
>> sometimes performs worse than similar python code, especially when tuples 
>> are involved. Did you say it was fixed?
>>
>> On Thursday, April 30, 2015 at 9:58:35 PM UTC+2, Páll Haraldsson wrote:
>>
>>>
>>> Hi,
>>>
>>> [As a best language is subjective, I'll put that aside for a moment.]
>>>
>>> Part I.
>>>
>>> The goal, as I understand, for Julia is at least within a factor of two 
>>> of C and already matching it mostly and long term beating that (and C++). 
>>> [What other goals are there? How about 0.4 now or even 1.0..?]
>>>
>>> While that is the goal as a language, you can write slow code in any 
>>> language and Julia makes that easier. :) [If I recall, Bezanson mentioned 
>>> it (the global "problem") as a feature, any change there?]
>>>
>>>
>>> I've been following this forum for months and newbies hit the same 
>>> issues. But almost always without fail, Julia can be speed up (easily as 
>>> Tim Holy says). I'm thinking about the exceptions to that - are there any 
>>> left? And about the "first code slowness" (see Part II).
>>>
>>> Just recently the last two flaws of Julia that I could see where fixed: 
>>> Decimal floating point is in (I'll look into the 100x slowness, that is 
>>> probably to be expected of any language, still I think may be a 
>>> misunderstanding and/or I can do much better). And I understand the tuple 
>>> slowness has been fixed (that was really the only "core language" defect). 
>>> The former wasn't a performance problem (mostly a non existence problem and 
>>> correctness one (where needed)..).
>>>
>>>
>>> Still we see threads like this one recent one:
>>>
>>> https://groups.google.com/forum/#!topic/julia-users/-bx9xIfsHHw
>>> "It seems changing the order of nested loops also helps"
>>>
>>> Obviously Julia can't beat assembly but really C/Fortran is already 
>>> close enough (within a small factor). The above row vs. column major 
>>> (caching effects in general) can kill performance in all languages. Putting 
>>> that newbie mistake aside, is there any reason Julia can be within a small 
>>> factor of assembly (or C) in all cases already?
>>>
>>>
>>> Part II.
>>>
>>> Except for caching issues, I still want the most newbie code or 
>>> intentionally brain-damaged code to run faster than at least 
>>> Python/scripting/interpreted languages.
>>>
>>> Potential problems (that I think are solved or at least not problems in 
>>> theory):
>>>
>>> 1. I know Any kills performance. Still, isn't that the default in Python 
>>> (and Ruby, Perl?)? Is there a good reason Julia can't be faster than at 
>>> least all the so-called scripting languages in all cases (excluding small 
>>> startup overhead, see below)?
>>>
>>> 2. The global issue, not sure if that slows other languages down, say 
>>> Python. Even if it doesn't, should Julia be slower than Python because of 
>>> global?
>>>
>>> 3. Garbage collection. I do not see that as a problem, incorrect? Mostly 
>>> performance variability ("[3D] games" - subject for another post, as I'm 
>>> not sure that is even a problem in theory..). Should reference counting 
>>> (Python) be faster? On the contrary, I think RC and even manual memory 
>>> management could be slower.
>>>
>>> 4. Concurrency, see nr. 3. GC may or may not have an issue with it. It 
>>> can be a problem, what about in Julia? There are concurrent GC algorithms 
>>> and

Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-04-30 Thread Páll Haraldsson
It seemed to me tuples were slow because of Any being used. I understand
tuples have been fixed; I'm not sure how.

I do not remember the post/all the details. Yes, tuples were slow, or
slower, than Python. Maybe it was Dict - isn't that kind of a tuple? Now we
have Pair in 0.4. I do not have 0.4; maybe I should bite the bullet and
install.. I'm not doing anything production related, just trying things out,
and using 0.3[.5] to avoid stability problems.. So I can't judge the speed..

Another potential issue I saw with tuples (maybe that is not a problem in
general, and I do not know whether other languages do this) is that they can
take a lot of memory (to copy around). I was thinking maybe they should do
something similar to databases: only use a fixed amount of memory (a "page")
with a pointer to overflow data..

2015-04-30 22:13 GMT+00:00 Ali Rezaee :

> They were interesting questions.
> I would also like to know why poorly written Julia code sometimes performs
> worse than similar python code, especially when tuples are involved. Did
> you say it was fixed?
>
> On Thursday, April 30, 2015 at 9:58:35 PM UTC+2, Páll Haraldsson wrote:
>
>>
>> Hi,
>>
>> [As a best language is subjective, I'll put that aside for a moment.]
>>
>> Part I.
>>
>> The goal, as I understand, for Julia is at least within a factor of two
>> of C and already matching it mostly and long term beating that (and C++).
>> [What other goals are there? How about 0.4 now or even 1.0..?]
>>
>> While that is the goal as a language, you can write slow code in any
>> language and Julia makes that easier. :) [If I recall, Bezanson mentioned
>> it (the global "problem") as a feature, any change there?]
>>
>>
>> I've been following this forum for months and newbies hit the same
>> issues. But almost always without fail, Julia can be speed up (easily as
>> Tim Holy says). I'm thinking about the exceptions to that - are there any
>> left? And about the "first code slowness" (see Part II).
>>
>> Just recently the last two flaws of Julia that I could see where fixed:
>> Decimal floating point is in (I'll look into the 100x slowness, that is
>> probably to be expected of any language, still I think may be a
>> misunderstanding and/or I can do much better). And I understand the tuple
>> slowness has been fixed (that was really the only "core language" defect).
>> The former wasn't a performance problem (mostly a non existence problem and
>> correctness one (where needed)..).
>>
>>
>> Still we see threads like this one recent one:
>>
>> https://groups.google.com/forum/#!topic/julia-users/-bx9xIfsHHw
>> "It seems changing the order of nested loops also helps"
>>
>> Obviously Julia can't beat assembly but really C/Fortran is already close
>> enough (within a small factor). The above row vs. column major (caching
>> effects in general) can kill performance in all languages. Putting that
>> newbie mistake aside, is there any reason Julia can be within a small
>> factor of assembly (or C) in all cases already?
>>
>>
>> Part II.
>>
>> Except for caching issues, I still want the most newbie code or
>> intentionally brain-damaged code to run faster than at least
>> Python/scripting/interpreted languages.
>>
>> Potential problems (that I think are solved or at least not problems in
>> theory):
>>
>> 1. I know Any kills performance. Still, isn't that the default in Python
>> (and Ruby, Perl?)? Is there a good reason Julia can't be faster than at
>> least all the so-called scripting languages in all cases (excluding small
>> startup overhead, see below)?
>>
>> 2. The global issue, not sure if that slows other languages down, say
>> Python. Even if it doesn't, should Julia be slower than Python because of
>> global?
>>
>> 3. Garbage collection. I do not see that as a problem, incorrect? Mostly
>> performance variability ("[3D] games" - subject for another post, as I'm
>> not sure that is even a problem in theory..). Should reference counting
>> (Python) be faster? On the contrary, I think RC and even manual memory
>> management could be slower.
>>
>> 4. Concurrency, see nr. 3. GC may or may not have an issue with it. It
>> can be a problem, what about in Julia? There are concurrent GC algorithms
>> and/or real-time (just not in Julia). Other than GC is there any big
>> (potential) problem for concurrent/parallel? I know about the threads work
>> and new GC in 0.4.
>>
>> 5. Subarrays ("array slicing"?). Not really what I consider a problem,
>> compared to say C (and Python?). I know 0.4 did optimize it, but what
>> languages do similar stuff? Functional ones?
>>
>> 6. In theory, pure functional languages "should" be faster. Are they in
>> practice in many or any case? Julia has non-mutable state if needed but
>> maybe not as powerful? This seems a double-edged sword. I think Julia
>> designers intentionally chose mutable state to conserve memory. Pros and
>> cons? Mostly Pros for Julia?
>>
>> 7. Startup time. Python is faster and for say web use, or compared to PHP
>> could be an issue, but would b

Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-04-30 Thread Páll Haraldsson
Interesting.. does that mean it is Unicode, then, that is especially faster,
or something else?

More than 800x faster is way worse than I thought, and there is no good
reason for it..

I'm really intrigued by what is this slow; it can't be the simple things,
like say just string concatenation?!
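
Naive concatenation is in fact an easy way to go quadratic in any language; 
a small illustrative benchmark (editor's sketch, 0.3/0.4-era API) of the 
usual fix:

    # Repeated `*=` re-allocates and copies the whole string on every append
    # (quadratic total work); accumulating into an IOBuffer is linear.
    function concat_naive(n)
        s = ""
        for i in 1:n
            s *= "x"
        end
        return s
    end

    function concat_buffer(n)
        io = IOBuffer()
        for i in 1:n
            write(io, "x")
        end
        return takebuf_string(io)        # 0.3/0.4 name for draining an IOBuffer
    end

    @time concat_naive(10^5)
    @time concat_buffer(10^5)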

You can get similar speed using PyCall.jl :)

For some obscure function like Levenshtein distance I might expect this (or
that it is simply not implemented yet in Julia), as Python would use tuned C
code - or in any function where you need to do non-trivial work per function
call.


I failed to add regex to the list as an example, as I was pretty sure it was
as fast as Perl (or faster, because of macros), since it uses the same
library.

Similarly, for all the Unicode/UTF-8 stuff I was not expecting slowness. I
know of the work on that in Python 2/3 and expected Julia could do, or did,
something similar.


2015-04-30 22:10 GMT+00:00 Scott Jones :

> Yes... Python will win on string processing... esp. with Python 3... I
> quickly ran into things that were > 800x faster in Python...
> (I hope to help change that though!)
>
> Scott
>
> On Thursday, April 30, 2015 at 6:01:45 PM UTC-4, Páll Haraldsson wrote:
>>
>> I wouldn't expect a difference in Julia for code like that (didn't
>> check). But I guess what we are often seeing is someone comparing a tuned
>> Python code to newbie Julia code. I still want it faster than that code..
>> (assuming same algorithm, note row vs. column major caveat).
>>
>> The main point of mine, *should* Python at any time win?
>>
>> 2015-04-30 21:36 GMT+00:00 Sisyphuss :
>>
>>> This post interests me. I'll write something here to follow this post.
>>>
>>> The performance gap between normal code in Python and badly-written code
>>> in Julia is something I'd like to know too.
>>> As far as I know, Python interpret does some mysterious optimizations.
>>> For example `(x**2)**2` is 100x faster than `x**4`.
>>>
>>>
>>>
>>>
>>> On Thursday, April 30, 2015 at 9:58:35 PM UTC+2, Páll Haraldsson wrote:


 Hi,

 [As a best language is subjective, I'll put that aside for a moment.]

 Part I.

 The goal, as I understand, for Julia is at least within a factor of two
 of C and already matching it mostly and long term beating that (and C++).
 [What other goals are there? How about 0.4 now or even 1.0..?]

 While that is the goal as a language, you can write slow code in any
 language and Julia makes that easier. :) [If I recall, Bezanson mentioned
 it (the global "problem") as a feature, any change there?]


 I've been following this forum for months and newbies hit the same
 issues. But almost always without fail, Julia can be speed up (easily as
 Tim Holy says). I'm thinking about the exceptions to that - are there any
 left? And about the "first code slowness" (see Part II).

 Just recently the last two flaws of Julia that I could see where fixed:
 Decimal floating point is in (I'll look into the 100x slowness, that is
 probably to be expected of any language, still I think may be a
 misunderstanding and/or I can do much better). And I understand the tuple
 slowness has been fixed (that was really the only "core language" defect).
 The former wasn't a performance problem (mostly a non existence problem and
 correctness one (where needed)..).


 Still we see threads like this one recent one:

 https://groups.google.com/forum/#!topic/julia-users/-bx9xIfsHHw
 "It seems changing the order of nested loops also helps"

 Obviously Julia can't beat assembly but really C/Fortran is already
 close enough (within a small factor). The above row vs. column major
 (caching effects in general) can kill performance in all languages. Putting
 that newbie mistake aside, is there any reason Julia can be within a small
 factor of assembly (or C) in all cases already?


 Part II.

 Except for caching issues, I still want the most newbie code or
 intentionally brain-damaged code to run faster than at least
 Python/scripting/interpreted languages.

 Potential problems (that I think are solved or at least not problems in
 theory):

 1. I know Any kills performance. Still, isn't that the default in
 Python (and Ruby, Perl?)? Is there a good reason Julia can't be faster than
 at least all the so-called scripting languages in all cases (excluding
 small startup overhead, see below)?

 2. The global issue, not sure if that slows other languages down, say
 Python. Even if it doesn't, should Julia be slower than Python because of
 global?

 3. Garbage collection. I do not see that as a problem, incorrect?
 Mostly performance variability ("[3D] games" - subject for another post, as
 I'm not sure that is even a problem in theory..). Should reference counting
 (Python) be faster? On the contrary, I think RC and even manual memory
 management could be slower.

>

[julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-04-30 Thread Ali Rezaee
Those were interesting questions.
I would also like to know why poorly written Julia code sometimes performs 
worse than similar Python code, especially when tuples are involved. Did 
you say it was fixed?
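
One common pattern behind that (a hypothetical illustration, not a claim 
about any particular benchmark): a container typed Any forces boxing and 
dynamic dispatch on every element, while a concretely typed container lets 
the compiler specialize the loop:

    function sumprod(xs)
        s = 0.0
        for (a, b) in xs          # destructure each tuple
            s += a * b
        end
        return s
    end

    xs_any  = Any[(rand(), rand()) for i in 1:10^6]   # eltype Any: boxed, dispatched
    xs_conc = [(rand(), rand()) for i in 1:10^6]      # concrete tuple eltype

    @time sumprod(xs_any)      # expect this one to be much slower
    @time sumprod(xs_conc)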

On Thursday, April 30, 2015 at 9:58:35 PM UTC+2, Páll Haraldsson wrote:

>
> Hi,
>
> [As a best language is subjective, I'll put that aside for a moment.]
>
> Part I.
>
> The goal, as I understand, for Julia is at least within a factor of two of 
> C and already matching it mostly and long term beating that (and C++). 
> [What other goals are there? How about 0.4 now or even 1.0..?]
>
> While that is the goal as a language, you can write slow code in any 
> language and Julia makes that easier. :) [If I recall, Bezanson mentioned 
> it (the global "problem") as a feature, any change there?]
>
>
> I've been following this forum for months and newbies hit the same issues. 
> But almost always without fail, Julia can be speed up (easily as Tim Holy 
> says). I'm thinking about the exceptions to that - are there any left? And 
> about the "first code slowness" (see Part II).
>
> Just recently the last two flaws of Julia that I could see where fixed: 
> Decimal floating point is in (I'll look into the 100x slowness, that is 
> probably to be expected of any language, still I think may be a 
> misunderstanding and/or I can do much better). And I understand the tuple 
> slowness has been fixed (that was really the only "core language" defect). 
> The former wasn't a performance problem (mostly a non existence problem and 
> correctness one (where needed)..).
>
>
> Still we see threads like this one recent one:
>
> https://groups.google.com/forum/#!topic/julia-users/-bx9xIfsHHw
> "It seems changing the order of nested loops also helps"
>
> Obviously Julia can't beat assembly but really C/Fortran is already close 
> enough (within a small factor). The above row vs. column major (caching 
> effects in general) can kill performance in all languages. Putting that 
> newbie mistake aside, is there any reason Julia can be within a small 
> factor of assembly (or C) in all cases already?
>
>
> Part II.
>
> Except for caching issues, I still want the most newbie code or 
> intentionally brain-damaged code to run faster than at least 
> Python/scripting/interpreted languages.
>
> Potential problems (that I think are solved or at least not problems in 
> theory):
>
> 1. I know Any kills performance. Still, isn't that the default in Python 
> (and Ruby, Perl?)? Is there a good reason Julia can't be faster than at 
> least all the so-called scripting languages in all cases (excluding small 
> startup overhead, see below)?
>
> 2. The global issue, not sure if that slows other languages down, say 
> Python. Even if it doesn't, should Julia be slower than Python because of 
> global?
>
> 3. Garbage collection. I do not see that as a problem, incorrect? Mostly 
> performance variability ("[3D] games" - subject for another post, as I'm 
> not sure that is even a problem in theory..). Should reference counting 
> (Python) be faster? On the contrary, I think RC and even manual memory 
> management could be slower.
>
> 4. Concurrency, see nr. 3. GC may or may not have an issue with it. It can 
> be a problem, what about in Julia? There are concurrent GC algorithms 
> and/or real-time (just not in Julia). Other than GC is there any big 
> (potential) problem for concurrent/parallel? I know about the threads work 
> and new GC in 0.4.
>
> 5. Subarrays ("array slicing"?). Not really what I consider a problem, 
> compared to say C (and Python?). I know 0.4 did optimize it, but what 
> languages do similar stuff? Functional ones?
>
> 6. In theory, pure functional languages "should" be faster. Are they in 
> practice in many or any case? Julia has non-mutable state if needed but 
> maybe not as powerful? This seems a double-edged sword. I think Julia 
> designers intentionally chose mutable state to conserve memory. Pros and 
> cons? Mostly Pros for Julia?
>
> 7. Startup time. Python is faster and for say web use, or compared to PHP 
> could be an issue, but would be solved by not doing CGI-style web. How 
> good/fast is Julia/the libraries right now for say web use? At least for 
> long running programs (intended target of Julia) startup time is not an 
> issue.
>
> 8. MPI, do not know enough about it and parallel in general, seems you are 
> doing a good job. I at least think there is no inherent limitation. At 
> least Python is not in any way better for parallel/concurrent?
>
> 9. Autoparallel. Julia doesn't try to be, but could (be an addon?). Is 
> anyone doing really good and could outperform manual Julia?
>
> 10. Any other I'm missing?
>
>
> Wouldn't any of the above or any you can think of be considered 
> performance bugs? I know for libraries you are very aggressive. I'm 
> thinking about Julia as a core language mostly, but maybe you are already 
> fastest already for most math stuff (if implemented at all)?
>
>
> I know to g

Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-04-30 Thread Scott Jones
Yes... Python will win on string processing... esp. with Python 3... I 
quickly ran into things that were > 800x faster in Python...
(I hope to help change that though!)

Scott

On Thursday, April 30, 2015 at 6:01:45 PM UTC-4, Páll Haraldsson wrote:
>
> I wouldn't expect a difference in Julia for code like that (didn't check). 
> But I guess what we are often seeing is someone comparing a tuned Python 
> code to newbie Julia code. I still want it faster than that code.. 
> (assuming same algorithm, note row vs. column major caveat).
>
> The main point of mine, *should* Python at any time win?
>
> 2015-04-30 21:36 GMT+00:00 Sisyphuss :
>
>> This post interests me. I'll write something here to follow this post.
>>
>> The performance gap between normal code in Python and badly-written code 
>> in Julia is something I'd like to know too.
>> As far as I know, Python interpret does some mysterious optimizations. 
>> For example `(x**2)**2` is 100x faster than `x**4`.
>>
>>
>>
>>
>> On Thursday, April 30, 2015 at 9:58:35 PM UTC+2, Páll Haraldsson wrote:
>>>
>>>
>>> Hi,
>>>
>>> [As a best language is subjective, I'll put that aside for a moment.]
>>>
>>> Part I.
>>>
>>> The goal, as I understand, for Julia is at least within a factor of two 
>>> of C and already matching it mostly and long term beating that (and C++). 
>>> [What other goals are there? How about 0.4 now or even 1.0..?]
>>>
>>> While that is the goal as a language, you can write slow code in any 
>>> language and Julia makes that easier. :) [If I recall, Bezanson mentioned 
>>> it (the global "problem") as a feature, any change there?]
>>>
>>>
>>> I've been following this forum for months and newbies hit the same 
>>> issues. But almost always without fail, Julia can be speed up (easily as 
>>> Tim Holy says). I'm thinking about the exceptions to that - are there any 
>>> left? And about the "first code slowness" (see Part II).
>>>
>>> Just recently the last two flaws of Julia that I could see where fixed: 
>>> Decimal floating point is in (I'll look into the 100x slowness, that is 
>>> probably to be expected of any language, still I think may be a 
>>> misunderstanding and/or I can do much better). And I understand the tuple 
>>> slowness has been fixed (that was really the only "core language" defect). 
>>> The former wasn't a performance problem (mostly a non existence problem and 
>>> correctness one (where needed)..).
>>>
>>>
>>> Still we see threads like this one recent one:
>>>
>>> https://groups.google.com/forum/#!topic/julia-users/-bx9xIfsHHw
>>> "It seems changing the order of nested loops also helps"
>>>
>>> Obviously Julia can't beat assembly but really C/Fortran is already 
>>> close enough (within a small factor). The above row vs. column major 
>>> (caching effects in general) can kill performance in all languages. Putting 
>>> that newbie mistake aside, is there any reason Julia can be within a small 
>>> factor of assembly (or C) in all cases already?
>>>
>>>
>>> Part II.
>>>
>>> Except for caching issues, I still want the most newbie code or 
>>> intentionally brain-damaged code to run faster than at least 
>>> Python/scripting/interpreted languages.
>>>
>>> Potential problems (that I think are solved or at least not problems in 
>>> theory):
>>>
>>> 1. I know Any kills performance. Still, isn't that the default in Python 
>>> (and Ruby, Perl?)? Is there a good reason Julia can't be faster than at 
>>> least all the so-called scripting languages in all cases (excluding small 
>>> startup overhead, see below)?
>>>
>>> 2. The global issue, not sure if that slows other languages down, say 
>>> Python. Even if it doesn't, should Julia be slower than Python because of 
>>> global?
>>>
>>> 3. Garbage collection. I do not see that as a problem, incorrect? Mostly 
>>> performance variability ("[3D] games" - subject for another post, as I'm 
>>> not sure that is even a problem in theory..). Should reference counting 
>>> (Python) be faster? On the contrary, I think RC and even manual memory 
>>> management could be slower.
>>>
>>> 4. Concurrency, see nr. 3. GC may or may not have an issue with it. It 
>>> can be a problem, what about in Julia? There are concurrent GC algorithms 
>>> and/or real-time (just not in Julia). Other than GC is there any big 
>>> (potential) problem for concurrent/parallel? I know about the threads work 
>>> and new GC in 0.4.
>>>
>>> 5. Subarrays ("array slicing"?). Not really what I consider a problem, 
>>> compared to say C (and Python?). I know 0.4 did optimize it, but what 
>>> languages do similar stuff? Functional ones?
>>>
>>> 6. In theory, pure functional languages "should" be faster. Are they in 
>>> practice in many or any case? Julia has non-mutable state if needed but 
>>> maybe not as powerful? This seems a double-edged sword. I think Julia 
>>> designers intentionally chose mutable state to conserve memory. Pros and 
>>> cons? Mostly Pros for Julia?
>>>
>>> 7. Startup time. Python is faster and for

Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-04-30 Thread Páll Haraldsson
I wouldn't expect a difference in Julia for code like that (didn't check).
But I guess what we are often seeing is someone comparing tuned Python code
to newbie Julia code. I still want Julia to be faster than that code..
(assuming the same algorithm; note the row- vs. column-major caveat).

My main point: *should* Python ever win?

2015-04-30 21:36 GMT+00:00 Sisyphuss :

> This post interests me. I'll write something here to follow this post.
>
> The performance gap between normal code in Python and badly-written code
> in Julia is something I'd like to know too.
> As far as I know, Python interpret does some mysterious optimizations. For
> example `(x**2)**2` is 100x faster than `x**4`.
>
>
>
>
> On Thursday, April 30, 2015 at 9:58:35 PM UTC+2, Páll Haraldsson wrote:
>>
>>
>> Hi,
>>
>> [As a best language is subjective, I'll put that aside for a moment.]
>>
>> Part I.
>>
>> The goal, as I understand, for Julia is at least within a factor of two
>> of C and already matching it mostly and long term beating that (and C++).
>> [What other goals are there? How about 0.4 now or even 1.0..?]
>>
>> While that is the goal as a language, you can write slow code in any
>> language and Julia makes that easier. :) [If I recall, Bezanson mentioned
>> it (the global "problem") as a feature, any change there?]
>>
>>
>> I've been following this forum for months and newbies hit the same
>> issues. But almost always without fail, Julia can be speed up (easily as
>> Tim Holy says). I'm thinking about the exceptions to that - are there any
>> left? And about the "first code slowness" (see Part II).
>>
>> Just recently the last two flaws of Julia that I could see where fixed:
>> Decimal floating point is in (I'll look into the 100x slowness, that is
>> probably to be expected of any language, still I think may be a
>> misunderstanding and/or I can do much better). And I understand the tuple
>> slowness has been fixed (that was really the only "core language" defect).
>> The former wasn't a performance problem (mostly a non existence problem and
>> correctness one (where needed)..).
>>
>>
>> Still we see threads like this one recent one:
>>
>> https://groups.google.com/forum/#!topic/julia-users/-bx9xIfsHHw
>> "It seems changing the order of nested loops also helps"
>>
>> Obviously Julia can't beat assembly but really C/Fortran is already close
>> enough (within a small factor). The above row vs. column major (caching
>> effects in general) can kill performance in all languages. Putting that
>> newbie mistake aside, is there any reason Julia can be within a small
>> factor of assembly (or C) in all cases already?
>>
>>
>> Part II.
>>
>> Except for caching issues, I still want the most newbie code or
>> intentionally brain-damaged code to run faster than at least
>> Python/scripting/interpreted languages.
>>
>> Potential problems (that I think are solved or at least not problems in
>> theory):
>>
>> 1. I know Any kills performance. Still, isn't that the default in Python
>> (and Ruby, Perl?)? Is there a good reason Julia can't be faster than at
>> least all the so-called scripting languages in all cases (excluding small
>> startup overhead, see below)?
>>
>> 2. The global issue, not sure if that slows other languages down, say
>> Python. Even if it doesn't, should Julia be slower than Python because of
>> global?
>>
>> 3. Garbage collection. I do not see that as a problem, incorrect? Mostly
>> performance variability ("[3D] games" - subject for another post, as I'm
>> not sure that is even a problem in theory..). Should reference counting
>> (Python) be faster? On the contrary, I think RC and even manual memory
>> management could be slower.
>>
>> 4. Concurrency, see nr. 3. GC may or may not have an issue with it. It
>> can be a problem, what about in Julia? There are concurrent GC algorithms
>> and/or real-time (just not in Julia). Other than GC is there any big
>> (potential) problem for concurrent/parallel? I know about the threads work
>> and new GC in 0.4.
>>
>> 5. Subarrays ("array slicing"?). Not really what I consider a problem,
>> compared to say C (and Python?). I know 0.4 did optimize it, but what
>> languages do similar stuff? Functional ones?
>>
>> 6. In theory, pure functional languages "should" be faster. Are they in
>> practice in many or any case? Julia has non-mutable state if needed but
>> maybe not as powerful? This seems a double-edged sword. I think Julia
>> designers intentionally chose mutable state to conserve memory. Pros and
>> cons? Mostly Pros for Julia?
>>
>> 7. Startup time. Python is faster and for say web use, or compared to PHP
>> could be an issue, but would be solved by not doing CGI-style web. How
>> good/fast is Julia/the libraries right now for say web use? At least for
>> long running programs (intended target of Julia) startup time is not an
>> issue.
>>
>> 8. MPI, do not know enough about it and parallel in general, seems you
>> are doing a good job. I at least think there is no inherent

[julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?

2015-04-30 Thread Sisyphuss
This post interests me. I'll write something here so I can follow it.

The performance gap between normal code in Python and badly-written code in 
Julia is something I'd like to know about too.
As far as I know, the Python interpreter does some mysterious optimizations. 
For example, `(x**2)**2` is 100x faster than `x**4`.



On Thursday, April 30, 2015 at 9:58:35 PM UTC+2, Páll Haraldsson wrote:
>
>
> Hi,
>
> [As a best language is subjective, I'll put that aside for a moment.]
>
> Part I.
>
> The goal, as I understand, for Julia is at least within a factor of two of 
> C and already matching it mostly and long term beating that (and C++). 
> [What other goals are there? How about 0.4 now or even 1.0..?]
>
> While that is the goal as a language, you can write slow code in any 
> language and Julia makes that easier. :) [If I recall, Bezanson mentioned 
> it (the global "problem") as a feature, any change there?]
>
>
> I've been following this forum for months and newbies hit the same issues. 
> But almost always without fail, Julia can be speed up (easily as Tim Holy 
> says). I'm thinking about the exceptions to that - are there any left? And 
> about the "first code slowness" (see Part II).
>
> Just recently the last two flaws of Julia that I could see where fixed: 
> Decimal floating point is in (I'll look into the 100x slowness, that is 
> probably to be expected of any language, still I think may be a 
> misunderstanding and/or I can do much better). And I understand the tuple 
> slowness has been fixed (that was really the only "core language" defect). 
> The former wasn't a performance problem (mostly a non existence problem and 
> correctness one (where needed)..).
>
>
> Still we see threads like this one recent one:
>
> https://groups.google.com/forum/#!topic/julia-users/-bx9xIfsHHw
> "It seems changing the order of nested loops also helps"
>
> Obviously Julia can't beat assembly but really C/Fortran is already close 
> enough (within a small factor). The above row vs. column major (caching 
> effects in general) can kill performance in all languages. Putting that 
> newbie mistake aside, is there any reason Julia can be within a small 
> factor of assembly (or C) in all cases already?
>
>
> Part II.
>
> Except for caching issues, I still want the most newbie code or 
> intentionally brain-damaged code to run faster than at least 
> Python/scripting/interpreted languages.
>
> Potential problems (that I think are solved or at least not problems in 
> theory):
>
> 1. I know Any kills performance. Still, isn't that the default in Python 
> (and Ruby, Perl?)? Is there a good reason Julia can't be faster than at 
> least all the so-called scripting languages in all cases (excluding small 
> startup overhead, see below)?
>
> 2. The global issue, not sure if that slows other languages down, say 
> Python. Even if it doesn't, should Julia be slower than Python because of 
> global?
>
> 3. Garbage collection. I do not see that as a problem, incorrect? Mostly 
> performance variability ("[3D] games" - subject for another post, as I'm 
> not sure that is even a problem in theory..). Should reference counting 
> (Python) be faster? On the contrary, I think RC and even manual memory 
> management could be slower.
>
> 4. Concurrency, see nr. 3. GC may or may not have an issue with it. It can 
> be a problem, what about in Julia? There are concurrent GC algorithms 
> and/or real-time (just not in Julia). Other than GC is there any big 
> (potential) problem for concurrent/parallel? I know about the threads work 
> and new GC in 0.4.
>
> 5. Subarrays ("array slicing"?). Not really what I consider a problem, 
> compared to say C (and Python?). I know 0.4 did optimize it, but what 
> languages do similar stuff? Functional ones?
>
> 6. In theory, pure functional languages "should" be faster. Are they in 
> practice in many or any case? Julia has non-mutable state if needed but 
> maybe not as powerful? This seems a double-edged sword. I think Julia 
> designers intentionally chose mutable state to conserve memory. Pros and 
> cons? Mostly Pros for Julia?
>
> 7. Startup time. Python is faster and for say web use, or compared to PHP 
> could be an issue, but would be solved by not doing CGI-style web. How 
> good/fast is Julia/the libraries right now for say web use? At least for 
> long running programs (intended target of Julia) startup time is not an 
> issue.
>
> 8. MPI, do not know enough about it and parallel in general, seems you are 
> doing a good job. I at least think there is no inherent limitation. At 
> least Python is not in any way better for parallel/concurrent?
>
> 9. Autoparallel. Julia doesn't try to be, but could (be an addon?). Is 
> anyone doing really good and could outperform manual Julia?
>
> 10. Any other I'm missing?
>
>
> Wouldn't any of the above or any you can think of be considered 
> performance bugs? I know for libraries you are very aggressive. I'm 
> thinking about Julia as a core language mos