[julia-users] Re: product of Int64 and Float64 is Int64
agreed On Wednesday, October 19, 2016 at 3:53:00 PM UTC+2, Krisztián Pintér wrote: > > > i know i shouldn't, but i'm kinda angry at this "1." notation. saving one > character really worth losing readability? also leading to errors like > this. personally, i would not even allow this syntax at all. > > On Wednesday, October 19, 2016 at 1:11:38 PM UTC+2, Michele Zaffalon wrote: >> >> I am confused by the type of the result of `1.*80`, which is `Int64`, >> despite the fact that `1.` is `Float64`, and that `Float64(1)*80` is a >> `Float64`: >> >
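For anyone landing here later: the surprise comes from parsing, not from promotion rules. `1.*80` parses as the elementwise `1 .* 80` (two Ints), so the result is an Int64; recent Julia versions reject the ambiguous spelling with a syntax error for exactly this reason. A minimal illustration:

```julia
# `1.*80` tokenizes as `1 .* 80` (elementwise multiplication of two
# Ints), not as the Float64 product `1. * 80`.
a = 1 .* 80        # Int (both operands are integers)
b = 1.0 * 80       # spelling the literal as `1.0` removes the ambiguity
println(typeof(a)) # an integer type
println(typeof(b)) # Float64
```

Writing float literals as `1.0` rather than `1.` avoids this class of error entirely.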
[julia-users] Re: Parallel file access
Well, if you want multiple processes to write into the DB you should use one that can handle concurrency, i.e. a "real" DB, not a simple desktop/embedded DB like SQLite. So for example Postgres, or if you do not want to deal with SQL then use a NoSQL DB, e.g. MongoDB (there are many more). For a column-store relational DB (good for analytics): MonetDB. If you still want all the data in one file at the end, then write a program that exports the data from the DB to a file at the end (that program is a single process, so no concurrency issues). You could also do everything in-memory and let it serialize to disk asynchronously: e.g. Apache Ignite (there are a bunch of others). There's also SciDB for an array-oriented DB. This is just a small sample of the possibilities. If you want a pure Julia solution, then you could do it with the Julia multiprocessing functionality, but you'll have to work with locking to coördinate between the processes (i.e. it isn't just the typical trivial "divide and conquer" data parallelism anymore). On Monday, October 17, 2016 at 7:07:28 PM UTC+2, Zachary Roth wrote: > > Thanks for the responses. > > Raph, thank you again. I very much appreciate your "humble offering". > I'll take a further look into your gist. > > Steven, I'm happy to use the right tool for the job...so long as I have an > idea of what it is. Would you care to offer more insights or suggestions > for the ill-informed (such as myself)? > > ---Zachary > > > > On Sunday, October 16, 2016 at 7:51:19 AM UTC-4, Steven Sagaert wrote: >> >> That's because SQLite isn't a multi-user DB server but a single-user >> embedded (desktop) DB. Use the right tool for the job. >> >> On Saturday, October 15, 2016 at 7:02:58 PM UTC+2, Ralph Smith wrote: >>> >>> How are the processes supposed to interact with the database? 
Without >>> extra synchronization logic, SQLite.jl gives (occasionally) >>> ERROR: LoadError: On worker 2: >>> SQLite.SQLiteException("database is locked") >>> which on the face of it suggests that all workers are using the same >>> connection, although I opened the DB separately in each process. >>> (I think we should get "busy" instead of "locked", but then still have >>> no good way to test for this and wait for a wake-up signal.) >>> So we seem to be at least as badly off as the original post, except with >>> DB calls instead of simple writes. >>> >>> We shouldn't have to stand up a separate multithreaded DB server just >>> for this. Would you be kind enough to give us an example of simple (i.e. >>> not client-server) multiprocess DB access in Julia? >>> >>> On Saturday, October 15, 2016 at 9:40:17 AM UTC-4, Steven Sagaert wrote: >>>> >>>> It still surprises me how in the scientific computing field people >>>> still refuse to learn about databases and then replicate database >>>> functionality in files in a complicated and probably buggy way. HDF5 is >>>> one example, there are many others. If you want to to fancy search (i.e. >>>> speedup search via indices) or do things like parallel writes/concurrency >>>> you REALLY should use databases. That's what they were invented for >>>> decades >>>> ago. Nowadays there a bigger choice than ever: Relational or >>>> non-relational >>>> (NOSQL), single host or distributed, web interface or not, disk-based or >>>> in-memory,... There really is no excuse anymore not to use a database if >>>> you want to go beyond just reading in a bunch of data in one go in memory. >>>> >>>> On Monday, October 10, 2016 at 5:09:39 PM UTC+2, Zachary Roth wrote: >>>>> >>>>> Hi, everyone, >>>>> >>>>> I'm trying to save to a single file from multiple worker processes, >>>>> but don't know of a nice way to coordinate this. When I don't >>>>> coordinate, >>>>> saving works fine much of the time. 
But I sometimes get errors with >>>>> reading/writing of files, which I'm assuming is happening because >>>>> multiple >>>>> processes are trying to use the same file simultaneously. >>>>> >>>>> I tried to coordinate this with a queue/channel of `Condition`s >>>>> managed by a task running in process 1, but this isn't working for me. >>>>> I've tried to simiplify this to track down the problem. At least part >>>>> of >>>>> the issue seems to be writing to the channel from process 2. >>>>> Specifically, >>>>> when I `put!` something onto a channel (or `push!` onto an array) from >>>>> process 2, the channel/array is still empty back on process 1. I feel >>>>> like >>>>> I'm missing something simple. Is there an easier way to go about >>>>> coordinating multiple processes that are trying to access the same file? >>>>> If not, does anyone have any tips? >>>>> >>>>> Thanks for any help you can offer. >>>>> >>>>> Cheers, >>>>> ---Zachary >>>>> >>>>
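On the `put!`/channel point at the end of the thread: a plain `Channel` constructed on process 1 is not shared; a `put!` on process 2 lands in process 2's own copy, which is why the channel looks empty back on process 1. The shared flavor is a `RemoteChannel`. A minimal sketch (modern `Distributed` spelling; at the time of this thread these names were in Base, and the file name and payloads here are placeholders) with process 1 as the single writer:

```julia
using Distributed   # stdlib in Julia >= 0.7; in Base at the time of this thread
addprocs(2)

# A RemoteChannel is hosted on process 1 but visible to every worker;
# a plain Channel put! into on a worker stays on that worker.
results = RemoteChannel(() -> Channel{Tuple{Int,String}}(32))

# Workers ship their results through the channel instead of touching
# the file themselves.
for p in workers()
    remote_do((ch, id) -> put!(ch, (id, "data from worker $id")), p, results, p)
end

# Process 1 is the only writer, so file access is serialized for free.
open("results.txt", "w") do io
    for _ in workers()
        id, payload = take!(results)
        println(io, "worker $id: $payload")
    end
end
```

The key design point is funnelling all file I/O through one process rather than locking the file.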
[julia-users] Re: Parallel file access
That's because SQLite isn't a multi-user DB server but a single-user embedded (desktop) DB. Use the right tool for the job. On Saturday, October 15, 2016 at 7:02:58 PM UTC+2, Ralph Smith wrote: > > How are the processes supposed to interact with the database? Without > extra synchronization logic, SQLite.jl gives (occasionally) > ERROR: LoadError: On worker 2: > SQLite.SQLiteException("database is locked") > which on the face of it suggests that all workers are using the same > connection, although I opened the DB separately in each process. > (I think we should get "busy" instead of "locked", but then still have no > good way to test for this and wait for a wake-up signal.) > So we seem to be at least as badly off as the original post, except with > DB calls instead of simple writes. > > We shouldn't have to stand up a separate multithreaded DB server just for > this. Would you be kind enough to give us an example of simple (i.e. not > client-server) multiprocess DB access in Julia? > > On Saturday, October 15, 2016 at 9:40:17 AM UTC-4, Steven Sagaert wrote: >> >> It still surprises me how in the scientific computing field people still >> refuse to learn about databases and then replicate database functionality >> in files in a complicated and probably buggy way. HDF5 is one example, >> there are many others. If you want to do fancy search (i.e. speed up search >> via indices) or do things like parallel writes/concurrency you REALLY >> should use databases. That's what they were invented for decades ago. >> Nowadays there is a bigger choice than ever: relational or non-relational >> (NoSQL), single host or distributed, web interface or not, disk-based or >> in-memory,... There really is no excuse anymore not to use a database if >> you want to go beyond just reading in a bunch of data in one go in memory. 
>> >> On Monday, October 10, 2016 at 5:09:39 PM UTC+2, Zachary Roth wrote: >>> >>> Hi, everyone, >>> >>> I'm trying to save to a single file from multiple worker processes, but >>> don't know of a nice way to coordinate this. When I don't coordinate, >>> saving works fine much of the time. But I sometimes get errors with >>> reading/writing of files, which I'm assuming is happening because multiple >>> processes are trying to use the same file simultaneously. >>> >>> I tried to coordinate this with a queue/channel of `Condition`s managed >>> by a task running in process 1, but this isn't working for me. I've tried >>> to simiplify this to track down the problem. At least part of the issue >>> seems to be writing to the channel from process 2. Specifically, when I >>> `put!` something onto a channel (or `push!` onto an array) from process 2, >>> the channel/array is still empty back on process 1. I feel like I'm >>> missing something simple. Is there an easier way to go about coordinating >>> multiple processes that are trying to access the same file? If not, does >>> anyone have any tips? >>> >>> Thanks for any help you can offer. >>> >>> Cheers, >>> ---Zachary >>> >>
[julia-users] Re: Parallel file access
It still surprises me how in the scientific computing field people still refuse to learn about databases and then replicate database functionality in files in a complicated and probably buggy way. HDF5 is one example; there are many others. If you want to do fancy search (i.e. speed up search via indices) or do things like parallel writes/concurrency you REALLY should use databases. That's what they were invented for decades ago. Nowadays there is a bigger choice than ever: relational or non-relational (NoSQL), single host or distributed, web interface or not, disk-based or in-memory,... There really is no excuse anymore not to use a database if you want to go beyond just reading in a bunch of data in one go in memory. On Monday, October 10, 2016 at 5:09:39 PM UTC+2, Zachary Roth wrote: > > Hi, everyone, > > I'm trying to save to a single file from multiple worker processes, but > don't know of a nice way to coordinate this. When I don't coordinate, > saving works fine much of the time. But I sometimes get errors with > reading/writing of files, which I'm assuming is happening because multiple > processes are trying to use the same file simultaneously. > > I tried to coordinate this with a queue/channel of `Condition`s managed by > a task running in process 1, but this isn't working for me. I've tried to > simplify this to track down the problem. At least part of the issue seems > to be writing to the channel from process 2. Specifically, when I `put!` > something onto a channel (or `push!` onto an array) from process 2, the > channel/array is still empty back on process 1. I feel like I'm missing > something simple. Is there an easier way to go about coordinating multiple > processes that are trying to access the same file? If not, does anyone > have any tips? > > Thanks for any help you can offer. > > Cheers, > ---Zachary >
[julia-users] Re: Julia-i18n logo proposal
How about using fundamental constants? Either from mathematics (pi, e, i) or from physics (G, h, c). On Friday, September 30, 2016 at 2:47:04 AM UTC+2, Waldir Pimenta wrote: > > Hi all. I made a proposal for the logo for the Julia-i18n organization: > http://imgh.us/julia-i18n_1.svg > > It uses the three most used scripts worldwide, and the characters are > actually the start of the word “Julia” as written in each of those scripts. > > Looking forward to know what you guys think. > > --Waldir >
[julia-users] Re: Idea: Julia Standard Libraries and Distributions
For me a distribution is more than just a cobbled-together bunch of disparate packages: ideally it should have a common style and work with common data structures for input/output (between methods) to exchange data. That's the real crux of the problem, not whether you need to manually import packages or not. On Tuesday, September 13, 2016 at 4:49:50 PM UTC+2, Randy Zwitch wrote: > > "Also, there's a good reason to ask "why fuss with distributions when > anyone could just add the packages and add the import statements to their > .juliarc?" (though its target audience is for people who don't know details > like the .juliarc, but also want Julia to work seamlessly like MATLAB)." > > I feel like if you're using MATLAB, it should be a really small step to > teach about the .juliarc file, as opposed to the amount of > maintenance/fragmentation that comes along with multiple distributions. > > Point #1 makes sense for me, if only because that's a use case that can't > be accomplished through simple textfile manipulation > > On Tuesday, September 13, 2016 at 4:39:15 AM UTC-4, Chris Rackauckas wrote: >> >> I think one major point of contention when talking about what should be >> included in Base due to competing factors: >> >> >>1. Some people would like a "lean Base" for things like embedded >>installs or other low memory applications >>2. Some people want a MATLAB-like "bells and whistles" approach. This >>way all the functions they use are just there: no extra packages to >>find/import. >>3. Some people like having things in Base because it "standardizes" >>things. >>4. Putting things in Base constrains their release schedule. >>5. Putting things in packages outside of JuliaLang helps free up >>Travis. >> >> >> The last two concerns have been why things like JuliaMath have sprung up >> to move things out of Base. However, I think there is some credibility to >> having some form of standardization. I think this can be achieved through >> some kind of standard library. 
This would entail a set of packages which >> are installed when Julia is installed, and a set of packages which add >> their using statement to the .juliarc. To most users this would be >> seamless: they would install automatically, and every time you open Julia, >> they would import automatically. There are a few issues there: >> >> >>1. This wouldn't work with building from source. This idea works >>better for binaries (this is no biggie since these users are likely more >>experienced anyways) >>2. Julia would have to pick winners and losers. >> >> That second part is big: what goes into the standard library? Would all >> of the MATLAB functions like linspace, find, etc. go there? Would the >> sparse matrix library be included? >> >> I think one way to circumvent the second issue would be to allow for >> Julia Distributions. A distribution would be defined by: >> >> >>1. A Julia version >>2. A List of packages to install (with versions?) >>3. A build script >>4. A .juliarc >> >> The ideal would be for one to be able to make an executable from those >> parts which would install the Julia version with the specified packages, >> build the packages (and maybe modify some environment variables / >> defaults), and add a .juliarc that would automatically import some packages >> / maybe define some constants or checkout branches. JuliaLang could then >> provide a lean distribution and a "standard distribution" where the >> standard distribution is a more curated library which people can fight >> about, but it's not as big of a deal if anyone can make their own. This has >> many upsides: >> >> >>1. Julia wouldn't have to come with what you don't want. >>2. Other than some edge cases where the advantages of Base come into >>play (I don't know of a good example, but I know there are some things >>which can't be defined outside of Base really well, like BigFloats? 
I'm >> not >>the expert on this.), most things could spawn out to packages without the >>standard user ever noticing. >>3. There would still be a large set of standard functions you can >>assume most people will have. >>4. You can share Julia setups: for example, with my lab I would share >>a distribution that would have all of the JuliaDiffEq packages installed, >>along with Plots.jl and some backends, so that way it would be "in the >> box >>solve differential equations and plot" setup like what MATLAB provides. I >>could pick packages/versions that I know work well together, >>and guarantee their install will work. >>5. You could write tutorials / run workshops which use a >>distribution, knowing that a given set of packages will be available. >>6. Anyone could make their setup match yours by looking at the >>distribution setup scripts (maybe just make a base function which runs >> that >>
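The auto-import half of the proposal quoted above needs no new machinery; a hypothetical ~/.juliarc.jl that a distribution could ship might look like this (the package names are placeholders, not a proposal for the actual standard set, and `warn` is the 0.5-era spelling):

```julia
# Hypothetical ~/.juliarc.jl shipped by a "distribution": every session
# starts with the curated package set already loaded.
const DISTRIBUTION_PACKAGES = [:Plots, :DifferentialEquations]

for pkg in DISTRIBUTION_PACKAGES
    try
        @eval using $pkg             # import into Main at startup
    catch err
        warn("distribution package $pkg failed to load: $err")
    end
end
```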
[julia-users] Re: Idea: Julia Standard Libraries and Distributions
I'm in favor of this. In fact I asked for the same thing in https://groups.google.com/forum/#!topic/julia-users/3g8zXaXfQqk although in a more cryptic way :) BTW: Java already has something like this: next to the 2 big standard distributions Java SE & Java EE (there's also a third specialized Java ME, but that one is paid and incompatible), there are now more fine-grained distributions called "profiles". In Java 9 with modules it will be even easier to create more profiles/distributions. On Tuesday, September 13, 2016 at 10:39:15 AM UTC+2, Chris Rackauckas wrote: > > I think one major point of contention when talking about what should be > included in Base due to competing factors: > > >1. Some people would like a "lean Base" for things like embedded >installs or other low memory applications >2. Some people want a MATLAB-like "bells and whistles" approach. This >way all the functions they use are just there: no extra packages to >find/import. >3. Some people like having things in Base because it "standardizes" >things. >4. Putting things in Base constrains their release schedule. >5. Putting things in packages outside of JuliaLang helps free up >Travis. > > > The last two concerns have been why things like JuliaMath have sprung up > to move things out of Base. However, I think there is some credibility to > having some form of standardization. I think this can be achieved through > some kind of standard library. This would entail a set of packages which > are installed when Julia is installed, and a set of packages which add > their using statement to the .juliarc. To most users this would be > seamless: they would install automatically, and every time you open Julia, > they would import automatically. There are a few issues there: > > >1. This wouldn't work with building from source. This idea works >better for binaries (this is no biggie since these users are likely more >experienced anyways) >2. Julia would have to pick winners and losers. 
> > That second part is big: what goes into the standard library? Would all of > the MATLAB functions like linspace, find, etc. go there? Would the sparse > matrix library be included? > > I think one way to circumvent the second issue would be to allow for Julia > Distributions. A distribution would be defined by: > > >1. A Julia version >2. A List of packages to install (with versions?) >3. A build script >4. A .juliarc > > The ideal would be for one to be able to make an executable from those > parts which would install the Julia version with the specified packages, > build the packages (and maybe modify some environment variables / > defaults), and add a .juliarc that would automatically import some packages > / maybe define some constants or checkout branches. JuliaLang could then > provide a lean distribution and a "standard distribution" where the > standard distribution is a more curated library which people can fight > about, but it's not as big of a deal if anyone can make their own. This has > many upsides: > > >1. Julia wouldn't have to come with what you don't want. >2. Other than some edge cases where the advantages of Base come into >play (I don't know of a good example, but I know there are some things >which can't be defined outside of Base really well, like BigFloats? I'm > not >the expert on this.), most things could spawn out to packages without the >standard user ever noticing. >3. There would still be a large set of standard functions you can >assume most people will have. >4. You can share Julia setups: for example, with my lab I would share >a distribution that would have all of the JuliaDiffEq packages installed, >along with Plots.jl and some backends, so that way it would be "in the box >solve differential equations and plot" setup like what MATLAB provides. I >could pick packages/versions that I know work well together, >and guarantee their install will work. >5. 
You could write tutorials / run workshops which use a distribution, >knowing that a given set of packages will be available. >6. Anyone could make their setup match yours by looking at the >distribution setup scripts (maybe just make a base function which runs > that >install since it would all be in Julia). This would be nice for some work >in progress projects which require checking out master on 3 different >packages, and getting some weird branch for another 5. I would give you a >succinct and standardized way to specify an install to get there. > > > Side notes: > > [An interesting distribution would be that JuliaGPU could provide a full > distribution for which CUDAnative works (since it requires a different > Julia install)] > > [A "Data Science Distribution" would be a cool idea: you'd never want to > include all of the plotting and statistical things inside of
Re: [julia-users] I'd like to see something like the Rust platform proposal in the Julia ecosystem
When I say "work well together" I don't just mean that their versions technically work together without errors, but also that they match stylistically, and that the data structures they expect as input/output match, so that no excessive translation and/or copying of data is needed, which is bad for performance and style. That kind of discussion is, for example, happening in OCaml in the effort to arrive at a platform and to resolve the OCaml standard lib vs Jane Street lib schism. On Monday, August 1, 2016 at 5:19:17 PM UTC+2, Steven Sagaert wrote: > > I think the most important part of it is the idea of having a second > (beyond the standard lib that comes with the runtime) larger, optional > layer of curated libs that are known to work together. That together with > the metapackage idea for easy inclusion ( maybe with possible overrides as > in the Rust proposal) would be very handy for people who do not do Julia > coding all the time and hence cannot follow the larger package ecosystem > closely. People who do not want this second layer could still just use the > standard lib + whatever packages they want. One could even extend this to > multiple layers, each one more optional and more lightly curated: standard lib -> > platform light -> extended platform -> ... > > Now there is already a good attempt in the julia ecosystem to group > related packages in webpages and try to avoid too many libraries that do > the same or partially overlap (more like scientific Python, rather than the > R jungle) and that's great, but per group there still are several > competing packages and sometimes it's unclear from the descriptions how to pick > a clear winner. A curated subset of these, "the platform", chosen by the community, > that addresses the most common needs except maybe for special niches, would > be very helpful. 
That's all :) > > On Monday, August 1, 2016 at 4:33:24 PM UTC+2, Stefan Karpinski wrote: >> >> There's a fair amount of discussion of the Rust Platform proposal over >> here: >> >> https://internals.rust-lang.org/t/proposal-the-rust-platform/3745 >> >> In short there's a lack of agreement to this in Rust. Moreover, in Rust, >> different versions of libraries are much more closely locked to each other, >> whereas in Julia the coupling is much looser. Steven, since you're in favor >> of this idea, can you explain why you think it's a good idea for Julia? >> What problems does it solve? >> >> On Mon, Aug 1, 2016 at 7:31 AM, Tony Kelman <to...@kelman.net> wrote: >> >>> The vision I personally have for this would be something more like SUSE >>> Studio (https://susestudio.com/) where it's just a few clicks, or a >>> configuration file in the build system, that could give you a set of >>> default-installed packages of your choosing, and make installers for your >>> own custom "spins" of a Julia-with-packages distribution. >>> >>> >>> >>> On Monday, August 1, 2016 at 2:08:06 AM UTC-7, Tim Holy wrote: >>>> >>>> module MyMetaPackage >>>> >>>> using Reexport >>>> >>>> @reexport using PackageA >>>> @reexport using PackageB >>>> ... >>>> >>>> end >>>> >>>> Best. >>>> --Tim >>>> >>>> On Monday, August 1, 2016 1:48:47 AM CDT Steven Sagaert wrote: >>>> > is more than just a webpage with a list of packages... for starters >>>> the >>>> > concept of metapackage. >>>> > >>>> > On Monday, August 1, 2016 at 10:25:33 AM UTC+2, Tamas Papp wrote: >>>> > > Maybe you already know about it, but there is a curated list of >>>> packages >>>> > > at https://github.com/svaksha/Julia.jl >>>> > > >>>> > > On Mon, Aug 01 2016, Steven Sagaert wrote: >>>> > > > see https://aturon.github.io/blog/2016/07/27/rust-platform/ >>>> >>>> >>>> >>
Re: [julia-users] I'd like to see something like the Rust platform proposal in the Julia ecosystem
I think the most important part of it is the idea of having a second (beyond the standard lib that comes with the runtime) larger, optional layer of curated libs that are known to work together. That together with the metapackage idea for easy inclusion (maybe with possible overrides as in the Rust proposal) would be very handy for people who do not do Julia coding all the time and hence cannot follow the larger package ecosystem closely. People who do not want this second layer could still just use the standard lib + whatever packages they want. One could even extend this to multiple layers, each one more optional and more lightly curated: standard lib -> platform light -> extended platform -> ... Now there is already a good attempt in the Julia ecosystem to group related packages in webpages and try to avoid too many libraries that do the same or partially overlap (more like scientific Python, rather than the R jungle) and that's great, but per group there still are several competing packages and sometimes it's unclear from the descriptions how to pick a clear winner. A curated subset of these, "the platform", chosen by the community, that addresses the most common needs except maybe for special niches, would be very helpful. That's all :) On Monday, August 1, 2016 at 4:33:24 PM UTC+2, Stefan Karpinski wrote: > > There's a fair amount of discussion of the Rust Platform proposal over > here: > > https://internals.rust-lang.org/t/proposal-the-rust-platform/3745 > > In short there's a lack of agreement to this in Rust. Moreover, in Rust, > different versions of libraries are much more closely locked to each other, > whereas in Julia the coupling is much looser. Steven, since you're in favor > of this idea, can you explain why you think it's a good idea for Julia? > What problems does it solve? 
> > On Mon, Aug 1, 2016 at 7:31 AM, Tony Kelman <to...@kelman.net > > wrote: > >> The vision I personally have for this would be something more like SUSE >> Studio (https://susestudio.com/) where it's just a few clicks, or a >> configuration file in the build system, that could give you a set of >> default-installed packages of your choosing, and make installers for your >> own custom "spins" of a Julia-with-packages distribution. >> >> >> >> On Monday, August 1, 2016 at 2:08:06 AM UTC-7, Tim Holy wrote: >>> >>> module MyMetaPackage >>> >>> using Reexport >>> >>> @reexport using PackageA >>> @reexport using PackageB >>> ... >>> >>> end >>> >>> Best. >>> --Tim >>> >>> On Monday, August 1, 2016 1:48:47 AM CDT Steven Sagaert wrote: >>> > is more than just a webpage with a list of packages... for starters >>> the >>> > concept of metapackage. >>> > >>> > On Monday, August 1, 2016 at 10:25:33 AM UTC+2, Tamas Papp wrote: >>> > > Maybe you already know about it, but there is a curated list of >>> packages >>> > > at https://github.com/svaksha/Julia.jl >>> > > >>> > > On Mon, Aug 01 2016, Steven Sagaert wrote: >>> > > > see https://aturon.github.io/blog/2016/07/27/rust-platform/ >>> >>> >>> >
Re: [julia-users] I'd like to see something like the Rust platform proposal in the Julia ecosystem
is more than just a webpage with a list of packages... for starters the concept of metapackage. On Monday, August 1, 2016 at 10:25:33 AM UTC+2, Tamas Papp wrote: > > Maybe you already know about it, but there is a curated list of packages > at https://github.com/svaksha/Julia.jl > > On Mon, Aug 01 2016, Steven Sagaert wrote: > > > see https://aturon.github.io/blog/2016/07/27/rust-platform/ > >
[julia-users] I'd like to see something like the Rust platform proposal in the Julia ecosystem
see https://aturon.github.io/blog/2016/07/27/rust-platform/
[julia-users] Re: How to close an HttpServer?
There isn't a function for that. You can shut it down either by killing the process, or by building a "shutdown" message into your program: when the server receives that HTTP request, it exits the Julia program by calling quit(). When I wrote an HttpServer-based service I also thought this was a missing feature of HttpServer. On Tuesday, November 24, 2015 at 11:42:49 AM UTC+1, Eric Forgy wrote: > > Hi, > > I remember reading a question similar to (or exactly like) this one, but > can't find it again. > > I can start an HttpServer easily enough, but how to close it? I can see > that WebSockets has a "close" method, but I can't find a way to close an > HttpServer. I am probably confused and this question makes no sense :) > > Thanks >
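A sketch of that workaround, following the usual HttpServer.jl handler pattern of the time (the `/shutdown` route name and the port are arbitrary choices for this example, not anything the package defines):

```julia
using HttpServer

http = HttpHandler() do req::Request, res::Response
    if req.resource == "/shutdown"
        # Reply first, then exit the whole Julia process shortly after;
        # HttpServer itself offered no close() for the listener.
        @async (sleep(0.5); quit())
        return Response("shutting down")
    end
    Response("hello")
end

server = Server(http)
run(server, 8000)
```

Since `quit()` takes down the entire process, this only makes sense when the server is the process's sole job.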
[julia-users] Re: ANN: ParallelAccelerator.jl v0.1 released
Great news! Apart from the focus on parallelism, the architecture seems similar to numba <http://numba.pydata.org/>. Good to finally have shared memory parallelism in Julia (not counting the shared memory array under Linux via shm). I wonder how this is going to interact with the upcoming multithreading in julia in the future. Will they play nice together or fight each other? Sincerely, Steven Sagaert On Wednesday, October 21, 2015 at 2:57:17 AM UTC+2, Lindsey Kuper wrote: > > The High Performance Scripting team at Intel Labs is pleased to announce > the release of version 0.1 of ParallelAccelerator.jl, a package for > high-performance parallel computing in Julia. > > ParallelAccelerator provides an @acc (short for "accelerate") macro for > annotating Julia functions. Together with a system C compiler (ICC or > GCC), it compiles @acc-annotated functions to optimized native code. > > Under the hood, ParallelAccelerator is essentially a domain-specific > compiler written in Julia. It performs additional analysis and optimization > on top of the Julia compiler. ParallelAccelerator discovers and exploits > the implicit parallelism in source programs that use parallel programming > patterns such as map, reduce, comprehension, and stencil. For example, > Julia array operators such as .+, .-, .*, ./ are translated by > ParallelAccelerator internally into data-parallel map operations over all > elements of input arrays. For the most part, these patterns are already > present in standard Julia, so programmers can use ParallelAccelerator to > run the same Julia program without (significantly) modifying the source > code. > > Version 0.1 should be considered an alpha release, suitable for early > adopters and Julia enthusiasts. Please file bugs at > https://travis-ci.org/IntelLabs/ParallelAccelerator.jl/issues . > > ParallelAccelerator requires Julia v0.4.0. 
See our GitHub repository at > https://github.com/IntelLabs/ParallelAccelerator.jl for a complete list > of prerequisites, supported platforms, example programs, and documentation. > > Thanks to our colleagues at Intel and Intel Labs, the Julia team, and the > broader Julia community for their support of our efforts! > > Best regards, > The High Performance Scripting team > (Programming Systems Lab, Intel Labs) > >
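For the curious, the usage pattern the announcement describes looks roughly like this (a hedged sketch; the function name and body are arbitrary, chosen only to contain the elementwise operators that ParallelAccelerator translates into data-parallel maps):

```julia
using ParallelAccelerator

# The elementwise .* and .+ below are implicit maps over the arrays;
# @acc compiles the function to optimized, parallel native code.
@acc function axpy_like(a::Float64, x::Vector{Float64}, y::Vector{Float64})
    a .* x .+ y
end

x = rand(10^6); y = rand(10^6)
axpy_like(2.0, x, y)
```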
Re: [julia-users] Re: Implementing mapreduce parallel model (not general multi-threading) ? easy and enough ?
I think what is meant is that in HPC this is typically done via MPI, which is a low-level approach where you explicitly have to specify all the data communication (compared to Hadoop & Spark, where it is implicit). > > > The only codes that really nail it are carefully handcrafted HPC codes. > > > Could you please elaborate on this? I think I know Spark code quite well, > but can't connect it to the notion of handcrafted HPC code. > > > > >>
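For comparison, the implicit-communication style in Julia's own multiprocessing model, where the runtime moves the data for you (modern `Distributed` spelling; in the Julia of this thread the reduction macro was spelled `@parallel` and lived in Base):

```julia
using Distributed
addprocs(2)

# map step: distributed over the workers by the runtime
squares = pmap(x -> x^2, 1:10)

# reduce step: each worker sums a chunk and the partial sums are
# combined; no explicit MPI-style sends/receives anywhere.
total = @distributed (+) for i in 1:1000
    i
end
# total == 500500
```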
Re: [julia-users] Re: IDE for Julia
+1, and to add to Uwe's post: AFAIK JuliaStudio is based on Qt (and on QtCreator, I believe). I thought JuliaStudio was a nice start; I wish a group would fork it and develop it further in the direction of RStudio. On Friday, September 18, 2015 at 10:08:23 AM UTC+2, Christof Stocker wrote: > > I would be a huge fan of an RStudio like Julia IDE > > On 2015-09-18 10:05, Uwe Fechner wrote: > > I like QT a lot. There is more than one open source, QT based IDE out > there, e.g. QT Creator. > QT has a GUI builder, that is much better than the GUI builders for GTK > (in my opinion). > And you can use the java-script like QML language for building the user > interface, if you want to. > > Tutorial for PyQT: > https://pythonspot.com/qml-and-pyqt-creating-a-gui-tutorial/ > > As soon as the Julia/C++ integration is available by default (hopefully in > Julia 0.5), QT integration > should be easy. For those, that want to play with Julia and QT now > already, have a look at: > > https://github.com/jverzani/PySide.jl > > (very experimental) > > Am Freitag, 18. September 2015 08:25:44 UTC+2 schrieb Daniel Carrera: >> >> There are no Qt bindings for Julia yet. I also don't know what text >> editing component is provided by Qt or what its features are. I began >> working with Gtk in part because the Julia Gtk bindings seem to be the most >> developed. >> >> Is there a reason you like Qt besides it being cross-platform? >> >> >> On 17 September 2015 at 23:50, SrAceves wrote: >> >>> What about Qt? RStudio is fantastic: Qt based, multi-platform. >>> Everything anyone ever wanted of an IDE. >>> >>> El martes, 15 de septiembre de 2015, 8:13:04 (UTC-5), Daniel Carrera >>> escribió: Last night I started experimenting with Gtk, and started making a sketch of what a Julia IDE might look like. 
In the process I am writing down a list of things that probably need to exist before a Julia IDE can be completed. This is what I have so far: 1) A Julia package for the GNOME Docking Library. Despite the name, it does not depend on any GNOME libraries, only Gtk. This is what Anjuta and MonoDevelop use to get docking windows, and I think most people expect to be able to move things around in an IDE. https://developer.gnome.org/gdl/ 2) On 14 September 2015 at 17:10, wrote: > Gtk, the code isn't published but it's very similar to Julietta: > > https://github.com/tknopp/Julietta.jl > >> >
[julia-users] Re: What does the `|>` operator do? (possibly a Gtk.jl question)
Think of it as Unix pipes. F# uses the exact same notation, and in fact in F# the |> notation is now more prevalent than "regular" function application notation because it reads left to right instead of right to left. You could also think of it as a special case of the monadic (oops! I said the m-word) >>= operator in Haskell. On Sunday, September 20, 2015 at 11:08:59 AM UTC+2, Daniel Carrera wrote: > > Looking at the code examples from Gtk.jl I found this code example: > > w = Gtk.@Window() |> > (f = Gtk.@Box(:h) |> > (b = Gtk.@Button("1")) |> > (c = Gtk.@Button("2")) |> > (f2 = Gtk.@Box(:v) |> > Gtk.@Label("3") |> > Gtk.@Label("4"))) |> > showall > > > This is just a compact way to create a Gtk window and put some objects in > it. But I had never seen that `|>` operator before, and I can't figure out > what it's doing. Is this operator somehow unique to Gtk.jl ? I can see > that they use it to nest widgets inside containers, but it's not clear to > me how it does it. > > Cheers, > Daniel. >
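Outside of Gtk.jl, `x |> f` in plain Julia is simply `f(x)`, so chains read left to right. A minimal sketch:

```julia
v = [3, 1, 2]

# x |> f is just f(x), so the pipeline reads left to right:
a = v |> sort |> first        # same as first(sort(v)), i.e. 1
b = v |> sum |> sqrt          # same as sqrt(sum(v)), i.e. sqrt(6)

# equivalent right-to-left nesting:
a == first(sort(v))
```

Gtk.jl additionally defines `|>` methods on its widget types so that piping a child into a container nests it, but the left-to-right reading is the same.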
[julia-users] Sparse matrix type signature in the wrong direction
I think that SparseMatrixCSC{Tv,Ti} should be SparseMatrixCSC{Ti,Tv}. Why? It's inconsistent with the mental picture of a map from integer indices -> values & inconsistent with the analogous type signature of a Dict: e.g. Dict{String,Float64}, which is key type -> value type. One might think this is an insignificant detail, but these little inconsistencies in language design do add up and create uncertainty & frustration. Maybe this could be addressed in a future release (not 0.4)?
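For reference, a sketch of the inconsistency being described (run on a modern Julia, where the sparse code lives in the SparseArrays stdlib rather than in Base as it did in the 0.3/0.4 era):

```julia
using SparseArrays

# 2x2 sparse matrix with values 1.0 and 2.0 on the diagonal
A = sparse([1, 2], [1, 2], [1.0, 2.0])
typeof(A)   # SparseMatrixCSC{Float64, Int64} -- value type Tv first, index type Ti second

d = Dict("a" => 1.0)
typeof(d)   # Dict{String, Float64} -- key (index) type first, value type second
```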
[julia-users] Re: Trivia question: Quaternions (ℍ), Octonions (𝕆), Sedenions (𝕊) etc. numbers are not supported, but are they in libraries or needed? And complex, vs. in others, say MATLAB..
quaternions might be useful for 3D rotations but higher-order constructs like octonions, etc. will not be very useful for numerical computing. They might be useful to pure mathematicians (or as an alternative math formalism for some (speculative) QFT stuff) but not in applied math, and pure mathematicians use Mathematica IF they use a computer at all (most of these esoteric creatures still stick to pencil and paper :) ). On Friday, August 28, 2015 at 11:50:45 AM UTC+2, Páll Haraldsson wrote: Mostly about math (that I do not know too well, this advanced, above complex numbers), just seems relevant to generic programming..: In e.g.: https://en.wikipedia.org/wiki/Complex_number in the template at the bottom: Real numbers and extensions: Real numbers (ℝ) Complex numbers (ℂ) Quaternions (ℍ) Octonions (𝕆) Sedenions (𝕊) Cayley–Dickson construction Dual numbers Split-complex numbers Bicomplex numbers Hypercomplex numbers Superreal numbers Irrational numbers Transcendental numbers Hyperreal numbers Levi-Civita field Surreal numbers http://math.stackexchange.com/questions/86434/what-lies-beyond-the-sedenions I see that dual numbers are supported already with a library, and know what Julia does support. Since Julia IS a scientific language, I checked: julia> 1+0im < 2+0im ERROR: `isless` has no method matching isless(::Complex{Int64}, ::Complex{Int64}) in < at operators.jl:32 that is the right thing. I wonder what MATLAB does, since all numbers(?) are stored as complex. Is 0im special-cased? Real numbers have properties that floating point already doesn't support, so losing those properties with higher-order numbers seems not to be an issue, but is it for the even higher-order numbers and those properties? I assume not, as there are no operators/functions, and even if there were operators (really just functions), then they would just stop working. In a language where everything is typed (not generic), this might not be a problem, but leads to run-time ERRORs/exceptions. 
Should you use ::Real at every point to avoid getting those errors with Complex? What about other possible numbers? Thanks in advance, -- Palli.
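For the one case the answer above concedes is practical, quaternions for 3D rotations, here is a minimal illustrative sketch. The `Quat` type and `rotate` helper are invented for this example; a real package such as Quaternions.jl provides this properly.

```julia
# Hypothetical minimal quaternion type, just to illustrate 3D rotation.
struct Quat
    w::Float64
    x::Float64
    y::Float64
    z::Float64
end

# Hamilton product
Base.:*(p::Quat, q::Quat) = Quat(
    p.w*q.w - p.x*q.x - p.y*q.y - p.z*q.z,
    p.w*q.x + p.x*q.w + p.y*q.z - p.z*q.y,
    p.w*q.y - p.x*q.z + p.y*q.w + p.z*q.x,
    p.w*q.z + p.x*q.y - p.y*q.x + p.z*q.w)

qconj(q::Quat) = Quat(q.w, -q.x, -q.y, -q.z)

# Rotate a 3-vector v by a unit quaternion q: embed v, compute q * v * conj(q).
function rotate(q::Quat, v)
    p = q * Quat(0.0, v[1], v[2], v[3]) * qconj(q)
    (p.x, p.y, p.z)
end

# 90-degree rotation about the z-axis: q = cos(θ/2) + sin(θ/2)·k
q = Quat(cos(pi/4), 0.0, 0.0, sin(pi/4))
rotate(q, (1.0, 0.0, 0.0))   # ≈ (0.0, 1.0, 0.0)
```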
[julia-users] Re: Julia-lang TCO / femto-lisp TCO
see http://blog.zachallaun.com/post/jumping-julia to work around not having TCO and still use recursion to traverse LARGE data structures without stack overflow. That's also how a bunch of other languages (e.g. Scala, F#) do this (it's called trampolining). On Sunday, November 24, 2013 at 3:49:14 PM UTC+1, Piotr X wrote: Hi! First of all, let me thank you for bringing another language to the world! Always good to see a new star in the sky. Coming from the Python and Scheme side of the table (well, it is a round table..), I have a couple of quick questions before getting my hands on Julia: 1) I understand that femto-lisp was integrated into Julia. 1.1) Is it possible to work mainly with femto-lisp, using Julia-lang packages, its concurrency/parallel computing libraries and its fast program-generating compiler? 1.2) Does the Julia-lang implementation of femto-lisp have TCO (tail call optimization)? 1.3) What important Scheme-like features are missing from femto-lisp? 2) I found a Julia-Dev thread on TCO. Is there any news on that? Regards, Piotr
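A sketch of the trampolining idea mentioned above (one common formulation, not necessarily the one in the linked post): each recursive step returns a zero-argument closure instead of calling itself, and a driver loop bounces on those closures in constant stack space.

```julia
# Trampolining: each "recursive" step returns a zero-argument closure (thunk)
# instead of calling itself directly; a driver loop bounces on the thunks,
# so the stack never grows no matter how deep the logical recursion is.
function trampoline(f, args...)
    r = f(args...)
    while r isa Function
        r = r()
    end
    return r
end

# Would overflow the stack as plain recursion for very large n:
countdown(n, acc) = n == 0 ? acc : (() -> countdown(n - 1, acc + 1))

trampoline(countdown, 1_000_000, 0)   # returns 1000000 without a stack overflow
```

This simple driver assumes a legitimate result is never itself a `Function`; a production version would wrap thunks in a dedicated type to avoid that ambiguity.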
[julia-users] Re: Julia-lang TCO / femto-lisp TCO
The point was doing it using recursion not iteration. When the data structures themselves are recursive, a recursive algorithm can be a lot simpler/shorter/more elegant than iteration. That's kind of the whole point of functional programming. On Tuesday, July 7, 2015 at 9:28:01 PM UTC+2, Steven G. Johnson wrote: On Tuesday, July 7, 2015 at 11:11:19 AM UTC-4, Steven Sagaert wrote: see http://blog.zachallaun.com/post/jumping-julia to work around not having TCO and still use recursion to traverse LARGE data structures without stackoverflow. That's also how a bunch of other languages (e.g. Scala F#) do this (called trampolining). (You could also just use a loop.)
[julia-users] Re: Escher/Compose/Gadfly for heavy visualization
You might want to check out https://plot.ly. On Thursday, July 2, 2015 at 4:34:20 PM UTC+2, Tom Breloff wrote: Yes the question was intentionally broad, because I wanted to get a birds-eye view of the state of web-visualization in Julia, and whether it's mature/performant enough to compete with a desktop application in OpenGL. I'm just trying to understand if the performance difference is 2x or 1000x. Here's some rough conclusions I've made from my very limited experience with Julia web packages... please tell me what I may be missing/overlooking (and especially where I'm just plain wrong): *Escher* Seems like this could be the standard common framework for web gui. Ideally I could define a rough gui layout with sliders, etc in Escher, and then include a highly performant module using WebGL or something similar that is just a Tile in Escher. Does this already exist? *Compose/Gadfly* Pros: Nice looking and fairly easy to use. Composition abstracts many details and allows easy reasoning about layouts/structure. Lots of features. Cons: Not very fast (I think) due potentially to some dependence on Cairo and heavy manipulation of DOM? Gadfly's dependence on DataFrames is generally frustrating... plotting is verbose with unnecessary data manipulation when your data isn't in the expected form. *Bokeh* This is certainly worth a look. Is this 2D-only? Can it play nicely with Escher? Do I need to write any python/javascript to create my visuals? *Compose3D* I came across this after my first post... seems to be on the right track... implementing a compose-like framework based in WebGL. Are there other similar efforts? Might there be a Gadfly3D which uses Compose3D as a backend? Are there any other similar efforts that haven't been mentioned that I should keep an eye on? Thanks for all your thoughts. On Thursday, July 2, 2015 at 8:33:41 AM UTC-4, Andreas Lobinger wrote: I forgot: Gadfly and Compose have some future looking plans for 3D. 
Recently there was here: https://groups.google.com/d/topic/julia-users/DLFWlN-lj_Y/discussion
[julia-users] Re: Announcement: Escher.jl - a toolkit for beautiful Web UIs in pure Julia
Looks super! Nice to see such a cool mix of features (functional, reactive, websocket, html5, Tex support,...) I just have one concern: since the GUI is immutable and involves a lot of julia code generation (and hence compilation): what's the performance like? On Monday, June 8, 2015 at 6:23:21 PM UTC+2, Shashi Gowda wrote: Hello all! I have been working on a package called *Escher* over the past several months. It is now quite feature-rich and ready to use in some sense. I put together an overview here: https://shashi.github.io/Escher.jl/* My aim is to converge at a UI toolkit that any Julia programmer can use to create rich interactive GUIs and deploy them over the web, *within minutes*. Escher simplifies the web platform into a simple and pleasant pure-Julia library. You don't need to learn or write HTML or CSS or JavaScript. Many problems associated with traditional web development basically disappear. There is no need to write separate front-end and back-end code, layouts are tractable and similar to layouts in the original TeX. Communication is done under-the-hood as and when required. No boiler plate code. Things just look great by default. Briefly, here is how Escher works under the hood: - A UI is an immutable Julia value that is converted to a Virtual DOM http://en.wikipedia.org/wiki/React_%28JavaScript_library%29#Virtual_DOM using the Patchwork https://github.com/shashi/Patchwork.jl library. Compose graphics and Gadfly plots also get rendered to Virtual DOM as well. - Subsequent updates to a UI are sent as patches to the current UI over a websocket connection - Input widgets send messages to the server over the same websocket connection - Complex things like tabs, slideshows, code editor, TeX and dropdown menus are set up as custom HTML elements using the Web Component http://webcomponents.org/ specification, mainly using the Polymer library http://polymer-project.org/. These things are just Virtual DOM nodes in the end. 
This is still a work in progress, I am very happy to receive any critique on this thread, and bug reports on Github https://github.com/shashi/Escher.jl. I am very excited to see what you will create with Escher. Thanks! :) Shashi * - Escher uses some bleeding-edge Web features, this page might not work so well on Safari, should work well on a decently new Chrome, and on Firefox if you wait for long enough for all the assets to load. I will be fixing this in due time, and also working on a cross-browser testing framework. PS: I have been dealing with RSI issues of late and my hands will appreciate any help with expanding the documentation! See https://github.com/shashi/Escher.jl/issues/26 if you wish to help.
[julia-users] Re: Roadmap for 0.4?
Any estimate when 0.4 will be available? I saw June 2015 on GitHub but is this realistic? On Tuesday, July 29, 2014 at 6:41:29 PM UTC+2, D johnson wrote: I saw the Roadmap for 0.3 here: https://github.com/JuliaLang/julia/issues/4853 But I cannot find the Roadmap for 0.4... Does anyone know where that is located? thx
Re: [julia-users] Suspending Garbage Collection for Performance...good idea or bad idea?
I'd say that manual memory management is usually going to be faster than GC unless you have really bad manual management and a very good GC. The best a good GC can hope for is to be close to manual management. That's one of the reasons the majority of systems software is still in C/C++ (memory layout is another). Note that not all GCs are created equal: the performance of GCs can range from terrible to excellent depending on the algorithm and the tuning. The Oracle JVM alone has several GC algorithms built in, each with its advantages and disadvantages. On Wednesday, May 13, 2015 at 6:58:24 PM UTC+2, Páll Haraldsson wrote: On Monday, May 11, 2015 at 10:03:20 PM UTC, Michael Louwrens wrote: I am starting to read Region-Based Memory Management for a Dynamically-Typed Language http://link.springer.com/content/pdf/10.1007/b102225.pdf#page=240 it proposes a second inference system, region inference. Interesting, I just scanned the paper down to Table 1. I see GC is in all ten benchmark cases faster than region-based. The heap size is usually larger (not always! It can be an order of magnitude larger for regions) for GC, so there is/can be a time-space trade-off. Should I expect GC (I assume this GC is similar to Julia's) to always be faster than manual memory management (such as in C/C++)? This region-based is not the same/similar as in C/C++? GC has to do the same allocations (and deallocations - except in a degenerate case - closing the program..) as manual memory management - at, I expect, the same speed, noting: GC *seems* to have an overhead as it has to scan the heap, but that is overblown as a drawback, right? As with larger memories that overhead will be arbitrarily reduced? [Not taking caching effects into account.. The memory itself will not cost you anything (in a single-application scenario) as RAM burns energy whether it is used or not.] And compared to C/C++ you would use less redundant copying. How much necessary copying is there, really..? 
I do not worry too much about hard real-time; GC seems to only have drawbacks with stuttering (and not so much with better GC algorithms). Even for games that are only soft real-time, wouldn't it be ok? Already as is (in 0.4)? It is not clear to me that something other than GC would be helpful there (one of the benchmarks was ray tracing), as you could force GC on vblank, and in next-gen ray tracing, vblank/fps isn't that important.. Besides, if you only work on data structures that are static/updated-in-place, there shouldn't be much GC activity? -- Palli.
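The "force GC on vblank" idea can be sketched as follows on a modern Julia (in the 0.3/0.4 era of this thread the same calls were spelled `gc_disable()`, `gc_enable()` and `gc()`):

```julia
# Suspend collection during a latency-sensitive section, then collect
# at a moment of our choosing (e.g. between frames / on vblank).
GC.enable(false)
frame = [rand(64) for _ in 1:100]   # allocations accumulate, but no GC pause here
GC.enable(true)
GC.gc()                             # pay the collection cost now, on our schedule
```

The obvious caveat is that allocations pile up while collection is disabled, so memory use grows until the explicit `GC.gc()` call.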
Re: [julia-users] Suspending Garbage Collection for Performance...good idea or bad idea?
If you take each heap region and give it its own garbage collector, and assign one to each thread/process/fiber, then you get another somewhat related approach for soft real-time performance in GC'd languages (it's for both performance and safe parallelism), like you have in Erlang/Elixir (dynamically typed, JITed/interpreted) or Nim (AOT, statically typed). On Tuesday, May 12, 2015 at 12:03:20 AM UTC+2, Michael Louwrens wrote: I am starting to read Region-Based Memory Management for a Dynamically-Typed Language http://link.springer.com/content/pdf/10.1007/b102225.pdf#page=240 it proposes a second inference system, region inference. I will read it fully in the morning but just scanning through their results they compare their regions to a GC. The GC library uses far more heap in most cases, though the region-based system needs optimisation to be competitive. At one point they do state a combination of region-based memory management and GC would be interesting. For a prototype implementation, being 2x-3x slower while mostly using far less memory is quite successful. The Div benchmark from the Gabriel Scheme benchmarks was the most impressive in terms of heap usage, using 32.2KB vs. 1219.6KB for the GC'd version. In a memory-constrained system this would be an interesting thing to look at; the outliers are a bit of a concern though. The Tak and Destruct benchmarks use almost 10x the amount of heap the GC did. If anything it was an interesting read. The emulated region-based management sounds quite interesting in fact. Will go read up on the two Steven Sagaert mentioned. Haven't read too much about G1 and nothing at all on Azul Zing!
[julia-users] What's the reasoning to have 2 different import mechanisms: using vs import?
As far as I can tell, using is almost like import, except that with import you can extend the functions and with using you cannot (but then with using module you also can extend them???), and there are some differences in name resolution (fully qualified or not). Is it a performance optimization (reducing the search space for method resolution)? But then why also allow extension for using module? Are there any plans to converge on one mechanism in the future?
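The rules have been tightened up since this question was written; on a modern Julia the distinction looks like this (a sketch with a throwaway module):

```julia
module Shapes
export area
area(r::Real) = pi * r^2   # area of a circle of radius r
end

using .Shapes              # exported names become visible...
area(2.0)                  # ...so calling works fine

# ...but adding a method requires an explicit import (or a qualified name):
import .Shapes: area
area(s::String) = length(s)   # now legal: extends Shapes.area

area("abc")   # 3
```

With only `using .Shapes` in effect, the `area(s::String)` definition would be rejected; the explicit `import` (or writing `Shapes.area(s::String) = ...`) is what licenses the extension.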
Re: [julia-users] Suspending Garbage Collection for Performance...good idea or bad idea?
Isn't that similar to smart pointers/automatic resource management in C++? On Tuesday, December 16, 2014 at 10:24:08 PM UTC+1, Stefan Karpinski wrote: I would love to figure out a way to bring the kind of automatic resource and memory release that Rust has to Julia, but the cost is some fairly finicky static compiler analysis that is ok for Rust's target demographic but fairly clearly unacceptable for Julia's general audience. What we'd need is a more dynamic version of something like that. One idea I've had is to indicate ownership and enforce it at run-time rather than compile time – and eliminate run-time checks when we can prove that they aren't needed. This could look something like having owned references to values versus borrowed references, checking that the object still exists if the borrowed reference is used, and raising an error if it doesn't. When an owned reference to the thing goes out of scope, immediately finalize it. I'm not sure how to indicate this, but it would be great to be able to write: function foo(...) fh = open(file) # do stuff with fh end # <- fh is automatically closed before the function returns # it's a runtime error to access the fh object after this Similarly, you could have something like this: function bar(...) a = @local Array(Int,10) # do stuff with a end # <- a is automatically freed before the function returns # it's a runtime error to access the a object after this Given these semantics, it would be relatively easy to alloca the actual memory of the array, and only heap allocate the object itself, which could then reference the stack allocated memory. This is tough to implement, especially efficiently, but I have a bit of a suspicion that mutable objects – and this only makes sense for mutable objects that are inherently associated with a particular place in memory – are rarely performance critical in Julia. 
On Mon, Dec 15, 2014 at 11:15 PM, John Myles White johnmyl...@gmail.com wrote: This is taking the thread off-topic, but conceptually such things are possible. But Rust has a very different set of semantics for memory ownership than Julia has and is doing a lot more analysis at compile-time than Julia is doing. So Julia would need to change a lot to be more like Rust. I've come to really adore Rust, so I'd like to see us borrow some ideas, but my personal sense is that Julia and Rust simply serve different niches and shouldn't really move towards one another too much lest each language wind up forsaking what makes it useful. -- John On Dec 15, 2014, at 8:43 PM, Eric Forgy eric@gmail.com wrote: Hi, I'm new to Julia and mentioned it to a friend who is more into systems than mathematical models and he mentioned his current crush is Rust, which is also built on LLVM. I may have totally missed the point, but IF I understand, Rust does away with garbage collection by borrow checking at compile time. The question popped into my head whether we could turn off GC in Julia and check for problems at compile time. A Google search later brought me to this thread. Is that a totally misguided idea? Best regards, Eric PS: You can tell I'm coming in with almost no background knowledge about compilers (or real languages for that matter), but am having fun learning. LLVM was developed at my alma mater (PhD in ECE - Computational Electromagnetics - from UIUC 2002). Go Illini! :) On Friday, February 22, 2013 7:11:32 PM UTC+8, Tim Holy wrote: Have you played with SProfile in the Profile package? It's rather good at highlighting which lines, in your code and in base/, trigger the gc. Note that in my experience the gc does not seem to be triggered necessarily on big allocations; for example, even allocating an array as Array(Int, (3,5)) rather than Array(Int, 3, 5) can trigger the gc (I see lots of gc() calls coming from our Lapack code for this reason). 
Because I don't really know how the gc works, I'm not certain that kind of thing actually reflects a problem; perhaps it was just going to have to call gc on the next heap-allocation event, and (3,5) just happened to be the lucky candidate. But there's an open issue about this: https://github.com/JuliaLang/julia/issues/1976 Docs are here: https://github.com/timholy/Profile.jl I think they're slightly out of date, but only in very minor ways. --Tim On Thursday, February 21, 2013 03:17:59 PM nathan hodas wrote: Here's the code that benefits from @nogc: Notice the iteration over a Dict and a Set. iscorrect() checks a field of the Attempt type. I can tell by running this particular function that the garbage collection is running during the for loop. function meantime(userdata::Dict{Int,Set{Attempt}}) usertimes = Dict{Int,Float64}() sizehint(usertimes,length(userdata)) for (uid,attempts) in
Re: [julia-users] Suspending Garbage Collection for Performance...good idea or bad idea?
Rust isn't the only language to use such ideas. Basically it's region-based memory management http://en.wikipedia.org/wiki/Region-based_memory_management. Real-time Java uses this. For a recent development next to Rust, check out ParaSail https://forge.open-do.org/plugins/moinmoin/parasail/. There's even now a version based on LLVM, so closer to home for Julia than Rust. On Sunday, May 10, 2015 at 8:20:42 PM UTC+2, Stefan Karpinski wrote: No, there hasn't been any change on this. It's unclear if anything from the Rust model can actually be leveraged in a dynamic language. On May 9, 2015, at 3:46 PM, Michael Louwrens michael.w...@outlook.com wrote: Have you had any further thought on this? It seems like it could be quite useful for the cases where one intentionally disables the GC for performance reasons - though you guys are incredibly busy! I also read about having the compiler automatically insert frees where it can in Julia and was wondering if that fits at all into this? On Tuesday, 16 December 2014 23:24:08 UTC+2, Stefan Karpinski wrote: I would love to figure out a way to bring the kind of automatic resource and memory release that Rust has to Julia, but the cost is some fairly finicky static compiler analysis that is ok for Rust's target demographic but fairly clearly unacceptable for Julia's general audience. What we'd need is a more dynamic version of something like that. One idea I've had is to indicate ownership and enforce it at run-time rather than compile time – and eliminate run-time checks when we can prove that they aren't needed. This could look something like having owned references to values versus borrowed references and check that the object still exists if the borrowed reference is used and raise an error if it doesn't. When an owned reference to the thing goes out of scope, immediately finalize it. I'm not sure how to indicate this, but it would be great to be able to write: function foo(...) 
fh = open(file) # do stuff with fh end # <- fh is automatically closed before the function returns # it's a runtime error to access the fh object after this Similarly, you could have something like this: function bar(...) a = @local Array(Int,10) # do stuff with a end # <- a is automatically freed before the function returns # it's a runtime error to access the a object after this Given these semantics, it would be relatively easy to alloca the actual memory of the array, and only heap allocate the object itself, which could then reference the stack allocated memory. This is tough to implement, especially efficiently, but I have a bit of a suspicion that mutable objects – and this only makes sense for mutable objects that are inherently associated with a particular place in memory – are rarely performance critical in Julia. On Mon, Dec 15, 2014 at 11:15 PM, John Myles White johnmyl...@gmail.com wrote: This is taking the thread off-topic, but conceptually such things are possible. But Rust has a very different set of semantics for memory ownership than Julia has and is doing a lot more analysis at compile-time than Julia is doing. So Julia would need to change a lot to be more like Rust. I've come to really adore Rust, so I'd like to see us borrow some ideas, but my personal sense is that Julia and Rust simply serve different niches and shouldn't really move towards one another too much lest each language wind up forsaking what makes it useful. -- John On Dec 15, 2014, at 8:43 PM, Eric Forgy eric@gmail.com wrote: Hi, I'm new to Julia and mentioned it to a friend who is more into systems than mathematical models and he mentioned his current crush is Rust, which is also built on LLVM. I may have totally missed the point, but IF I understand, Rust does away with garbage collection by borrow checking at compile time. The question popped into my head whether we could turn off GC in Julia and check for problems at compile time. A Google search later brought me to this thread. 
Is that a totally misguided idea? Best regards, Eric PS: You can tell I'm coming in with almost no background knowledge about compilers (or real languages for that matter), but am having fun learning. LLVM was developed at my alma mater (PhD in ECE - Computational Electromagnetics - from UIUC 2002). Go Illini! :) On Friday, February 22, 2013 7:11:32 PM UTC+8, Tim Holy wrote: Have you played with SProfile in the Profile package? It's rather good at highlighting which lines, in your code and in base/, trigger the gc. Note that in my experience the gc does not seem to be triggered necessarily on big allocations; for example, even allocating an array as Array(Int, (3,5)) rather than Array(Int, 3, 5) can trigger the gc (I see lots of gc() calls coming from our Lapack code for this reason). Because I don't really know how the gc works,
Re: [julia-users] Suspending Garbage Collection for Performance...good idea or bad idea?
I guess you can approximate/emulate region-based memory management in a GC'd language by dividing the heap into many small regions and running GC over all these regions regularly. That's what the G1 GC in Java does; see also Azul Zing for soft real-time, high-performance Java. On Monday, May 11, 2015 at 2:21:58 PM UTC+2, Steven Sagaert wrote: Rust isn't the only language to use such ideas. Basically it's region-based memory management http://en.wikipedia.org/wiki/Region-based_memory_management. Real-time Java uses this. For a recent development next to Rust, check out ParaSail https://forge.open-do.org/plugins/moinmoin/parasail/. There's even now a version based on LLVM, so closer to home for Julia than Rust. On Sunday, May 10, 2015 at 8:20:42 PM UTC+2, Stefan Karpinski wrote: No, there hasn't been any change on this. It's unclear if anything from the Rust model can actually be leveraged in a dynamic language. On May 9, 2015, at 3:46 PM, Michael Louwrens michael.w...@outlook.com wrote: Have you had any further thought on this? It seems like it could be quite useful for the cases where one intentionally disables the GC for performance reasons - though you guys are incredibly busy! I also read about having the compiler automatically insert frees where it can in Julia and was wondering if that fits at all into this? On Tuesday, 16 December 2014 23:24:08 UTC+2, Stefan Karpinski wrote: I would love to figure out a way to bring the kind of automatic resource and memory release that Rust has to Julia, but the cost is some fairly finicky static compiler analysis that is ok for Rust's target demographic but fairly clearly unacceptable for Julia's general audience. What we'd need is a more dynamic version of something like that. One idea I've had is to indicate ownership and enforce it at run-time rather than compile time – and eliminate run-time checks when we can prove that they aren't needed. 
This could look something like having owned references to values versus borrowed references and check that the object still exists if the borrowed reference is used and raise an error if it doesn't. When an owned reference to the thing goes out of scope, immediately finalize it. I'm not sure how to indicate this, but it would be great to be able to write: function foo(...) fh = open(file) # do stuff with fh end # <- fh is automatically closed before the function returns # it's a runtime error to access the fh object after this Similarly, you could have something like this: function bar(...) a = @local Array(Int,10) # do stuff with a end # <- a is automatically freed before the function returns # it's a runtime error to access the a object after this Given these semantics, it would be relatively easy to alloca the actual memory of the array, and only heap allocate the object itself, which could then reference the stack allocated memory. This is tough to implement, especially efficiently, but I have a bit of a suspicion that mutable objects – and this only makes sense for mutable objects that are inherently associated with a particular place in memory – are rarely performance critical in Julia. On Mon, Dec 15, 2014 at 11:15 PM, John Myles White johnmyl...@gmail.com wrote: This is taking the thread off-topic, but conceptually such things are possible. But Rust has a very different set of semantics for memory ownership than Julia has and is doing a lot more analysis at compile-time than Julia is doing. So Julia would need to change a lot to be more like Rust. I've come to really adore Rust, so I'd like to see us borrow some ideas, but my personal sense is that Julia and Rust simply serve different niches and shouldn't really move towards one another too much lest each language wind up forsaking what makes it useful. 
-- John On Dec 15, 2014, at 8:43 PM, Eric Forgy eric@gmail.com wrote: Hi, I'm new to Julia and mentioned it to a friend who is more into systems than mathematical models and he mentioned his current crush is Rust, which is also built on LLVM. I may have totally missed the point, but IF I understand, Rust does away with garbage collection by borrow checking at compile time. The question popped into my head whether we could turn off GC in Julia and check for problems at compile time. A Google search later brought me to this thread. Is that a totally misguided idea? Best regards, Eric PS: You can tell I'm coming in with almost no background knowledge about compilers (or real languages for that matter), but am having fun learning. LLVM was developed at my alma mater (PhD in ECE - Computational Electromagnetics - from UIUC 2002). Go Illini! :) On Friday, February 22, 2013 7:11:32 PM UTC+8, Tim Holy wrote: Have you played with SProfile in the Profile package? It's rather good at highlighting which lines, in your code and in base/, trigger the gc
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
You really should ask the language designers about this for a definitive answer, but one of the reasons strings are immutable in Julia (and in Java and others) is that it makes them good keys for Dicts. On Saturday, May 2, 2015 at 7:16:24 PM UTC+2, Jameson wrote: IOBuffer does not inherit from string, nor does it implement any of the methods expected of a mutable string (length, endof, insert! / splice! / append!). If you want strings that support all of those operations, then you will need something different from an IOBuffer. If you just wanted a fast string builder, then IOBuffer is the right abstraction (ending with a call to `takebuf_string`). This dichotomy helps to give a clear distinction in the code between the construction phase and usage phase. On Sat, May 2, 2015 at 12:49 PM Páll Haraldsson pall.ha...@gmail.com wrote: 2015-05-01 16:42 GMT+00:00 Steven G. Johnson steve...@gmail.com: In Julia, Ruby, Java, Go, and many other languages, concatenation allocates a new string and hence building a string by repeated concatenation is O(n^2). That doesn't mean that those other languages lose on string processing to Python, it just means that you have to do things slightly differently (e.g. write to an IOBuffer in Julia). You can't always expect the *same code* (translated as literally as possible) to be the optimal approach in different languages, and it is inflammatory to compare languages according to this standard. A fairer question is whether it is *much harder* to get good performance in one language vs. another for a certain task. There will certainly be tasks where Python is still superior in this sense simply because there are many cases where Python calls highly tuned C libraries for operations that have not been as optimized in Julia. Julia will tend to shine the further you stray from built-in operations in your performance-critical code. 
What I would like to know is: do you need to make your own string type to make Julia as fast (within a constant factor) as, say, Python? In another answer, IOBuffer was said to be not good enough. -- Palli.
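To make the O(n^2) point above concrete, here is a minimal sketch in modern Julia syntax (where the old `takebuf_string(io)` has become `String(take!(io))`): repeated concatenation reallocates the whole string on every step, while an IOBuffer accumulates bytes and materializes the string once at the end.

```julia
# Naive builder: each `*` allocates a fresh String, so total work is O(n^2).
function build_naive(n)
    s = ""
    for i in 1:n
        s = s * string(i) * ","
    end
    return s
end

# IOBuffer builder: append bytes, then materialize the String once at the end.
function build_buffered(n)
    io = IOBuffer()
    for i in 1:n
        print(io, i, ',')
    end
    return String(take!(io))   # was `takebuf_string(io)` in 0.3/0.4-era Julia
end
```

Both return identical strings; timing them with `@time` for large n shows the quadratic-vs-linear gap.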
[julia-users] Re: the state of GUI toolkits?
Another way (maybe conceptually the nicest) would be to look at how the Qt team integrated the QML/javascript VM with the Qt C++ code and do the analogous thing for the julia runtime. QML would then be replaced by Julia (or a julia-based DSL, e.g. via macros). The advantage would be that you don't have to export the julia code that you want to call from the GUI to Qt, since the GUI is in julia. The disadvantage is that you lose the ability to design the GUI graphically via the design tool.

On Friday, May 1, 2015 at 2:58:06 PM UTC+2, Tom Breloff wrote: Steven... can you post a summary of how you would ideally interact with Qt5.jl? I assume you'd want clean syntax to define a gui with a QML string (or file), along with clean syntax to define signals/slots to connect to julia callbacks. Could you post some simple/hypothetical code that you would ideally call from within julia? This will help as I'm reading the Qt5 docs. Thanks.

On Friday, May 1, 2015 at 4:59:43 AM UTC-4, Steven Sagaert wrote: I think it depends how you want to build the GUI: if you want to do it old school, by calling a bunch of julia functions/methods that wrap the C++ methods, then yes, a lot of C++ classes/methods will need to be wrapped. However, if you stick to the new-school approach, i.e. QtQuick + QML, then a lot fewer C++ classes need to be wrapped, and GUI construction would basically be writing QML strings and passing them from julia to the Qt5 side, which then interprets them. So in that sense it's not that different from, say, writing an SQL library wrapper like ODBC.

On Thursday, April 30, 2015 at 10:59:32 PM UTC+2, Max Suster wrote: Good to hear interest. I will also have to look at what might be a good strategy for wrapping Qt5 with Cxx. The core functionality of Qt5 (shared by Qt4) would be an obvious place to start. The part that is clearly daunting is the interface for event handling, namely signals and slots.
Not only do we have to deal with/replace the underlying XML support, but the syntax has also changed a lot between Qt4 and Qt5.
[julia-users] Re: the state of GUI toolkits?
The idea would be to describe the GUI and its behavior via QML/javascript. Any julia modules/functions (event handlers) that need to be called from the GUI could be exported as C++ QtObjects/methods (that's the hard part). These can then be called from QML. Passing the QML to Qt and having it interpreted would be done by wrapping the QtQML C++ code with julia code.

On Friday, May 1, 2015 at 2:58:06 PM UTC+2, Tom Breloff wrote: Steven... can you post a summary of how you would ideally interact with Qt5.jl? [...]
[julia-users] Re: the state of GUI toolkits?
One could of course let Qt/C++ be in charge of the main loop and just run julia as an embedded engine, exposing the julia functionality one wants to call as QtObject methods so that these can be called from QML.

On Friday, May 1, 2015 at 3:49:12 PM UTC+2, Steven Sagaert wrote: The idea would be to describe the GUI and its behavior via QML/javascript. Any julia modules/functions (event handlers) that need to be called from the GUI could be exported as C++ QtObjects/methods (that's the hard part). [...]
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
Of course I'm not saying loops should not be benchmarked, and I do use loops in julia too. I'm just saying that when doing performance comparisons, one should try to write the programs in each language in their most optimal style, rather than in a similar style which is optimal for one language but very suboptimal in another. Ah, I didn't know the article was rebutted by Stefan. I read that article before that happened and just looked it up again now as an example. I guess the conclusion is that cross-language performance benchmarks are very tricky, which was kinda my point :)

On Friday, May 1, 2015 at 3:13:24 PM UTC+2, Tim Holy wrote: Hi Steven, I understand your point---you're saying you'd be unlikely to write those algorithms in that manner if your goal were to do those particular computations. But the important point to keep in mind is that those benchmarks are simply toys for the purpose of testing the performance of various language constructs. If you think it's irrelevant to benchmark loops for scientific code, then you do very, very different stuff than me. Not all algorithms reduce to BLAS calls. I use julia to write all kinds of algorithms that I used to write MEX functions for, back in my Matlab days. If all you need is A*b, then of course basically any scientific language will be just fine, with minimal differences in performance. Moreover, that R benchmark on cumsum is simply not credible. I'm not sure what was happening (and that article doesn't post its code or the procedures used to test), but julia's cumsum reduces to efficient machine code (basically, a bunch of addition operations). If they were computing cumsum across a specific dimension, then this PR: https://github.com/JuliaLang/julia/pull/7359 changed things. But more likely, someone forgot to run the code twice (so it got JIT-compiled), had a type instability in the code they were testing, or made some other mistake.
It's too bad one can make mistakes, of course, but then it becomes a comparison of different programmers rather than different programming languages. Indeed, if you read the comments on that post, Stefan already rebutted that benchmark, with a 4x advantage for Julia: https://matloff.wordpress.com/2014/05/21/r-beats-python-r-beats-julia-anyone-else-wanna-challenge-r/comment-page-1/#comment-89 --Tim

On Friday, May 01, 2015 01:25:50 AM Steven Sagaert wrote: I think the performance comparisons between Julia and Python are flawed. They seem to be between standard Python and Julia, but since Julia is all about scientific programming it really should be between SciPy and Julia. Since SciPy uses much of the same underlying libs in Fortran/C, the performance gap will be much smaller, and to be really fair it should be between numba-compiled SciPy code and julia. I suspect the performance will be very close then (and close to C performance). Similarly, the standard benchmark (on the opening page of the julia website) between R and julia is also flawed, because it takes the best-case scenario for julia (loops and mutable data structures) and the worst-case scenario for R. When the same R program is rewritten in vectorised style it beats julia, see https://matloff.wordpress.com/2014/05/21/r-beats-python-r-beats-julia-anyone-else-wanna-challenge-r/. So my interest in julia isn't because it is the fastest scientific high-level language (because clearly at this stage you can't really claim that), but because it's a clean, interesting language (still needs work on some rough edges of course) with clean(er) and clear(er) libraries, and one that gives reasonable performance out of the box without much tweaking.

On Friday, May 1, 2015 at 12:10:58 AM UTC+2, Scott Jones wrote: Yes... Python will win on string processing... esp. with Python 3... I quickly ran into things that were 800x faster in Python... (I hope to help change that though!)
Scott

On Thursday, April 30, 2015 at 6:01:45 PM UTC-4, Páll Haraldsson wrote: I wouldn't expect a difference in Julia for code like that (didn't check). But I guess what we are often seeing is someone comparing tuned Python code to newbie Julia code. I still want it faster than that code.. (assuming the same algorithm; note the row- vs. column-major caveat). My main point: *should* Python at any time win?

2015-04-30 21:36 GMT+00:00 Sisyphuss zhengw...@gmail.com: This post interests me. I'll write something here to follow this post. The performance gap between normal code in Python and badly-written code in Julia is something I'd like to know too. As far as I know, the Python interpreter does some mysterious optimizations. For example, `(x**2)**2` is 100x faster than `x**4`.

On Thursday, April 30, 2015 at 9:58:35 PM UTC+2, Páll Haraldsson wrote: Hi, [As a best
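To illustrate the "optimal style per language" point (and Tim's cumsum example above), here is a sketch of one computation written both vectorized, R-style, and as a plain loop. In Julia both are legitimate, and the loop version avoids the temporary array that `x .^ 2` allocates; in R only the first style is fast.

```julia
# Cumulative sum of squares, vectorized: allocates a temporary for x .^ 2.
vectorized(x) = cumsum(x .^ 2)

# The same computation as an explicit loop: one pass, no temporaries.
function looped(x)
    out = similar(x)
    acc = zero(eltype(x))
    for i in eachindex(x)
        acc += x[i]^2
        out[i] = acc
    end
    return out
end
```

Both produce the same result; the point is that a fair cross-language benchmark would use `looped` in Julia and the vectorized form in R.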
Re: [julia-users] Performance variability - can we expect Julia to be the fastest (best) language?
Scott, you shouldn't take my reply personally. It wasn't really about the specific string case you mentioned, but more in general about Python and julia performance comparisons.

On Friday, May 1, 2015 at 3:10:14 PM UTC+2, Scott Jones wrote: On May 1, 2015, at 8:23 AM, Steven Sagaert steven@gmail.com wrote: On Friday, May 1, 2015 at 12:26:54 PM UTC+2, Scott Jones wrote: On Friday, May 1, 2015 at 4:25:50 AM UTC-4, Steven Sagaert wrote: I think the performance comparisons between Julia and Python are flawed. They seem to be between standard Python and Julia, but since Julia is all about scientific programming it really should be between SciPy and Julia. Since SciPy uses much of the same underlying libs in Fortran/C, the performance gap will be much smaller, and to be really fair it should be between numba-compiled SciPy code and julia. I suspect the performance will be very close then (and close to C performance). Why should Julia be limited to scientific programming? I think it can be a great language for general programming. I agree, but for now and the short-term future I think the core domain of julia is scientific computing/data science, and so to have fair comparisons one should not just compare julia to vanilla Python but especially to SciPy and numba. I stated that my comparisons were of string processing… what's unfair about that? I have no expertise to compare Julia to any scientific computing system; I'll leave that to the people here that do (and there are many, very highly qualified). Also, even in technical computing, the performance issues I raise may be of some importance, for example, issues about the performance of connecting to a database… I assume that sometimes you need to read scientific data from a database, and store results to one? Scott
Re: [julia-users] Re: Performance of Distributed Arrays
No. I've thought about maybe writing a wrapper for Spark, but only after julia 0.4 (and the new and improved dataframes) has landed. It also depends how much time I could spend on it at my day job :)

On Friday, May 1, 2015 at 2:57:57 PM UTC+2, Sebastian Good wrote: Steven, are you working on such things at the moment?

On Friday, May 1, 2015 at 4:50:39 AM UTC-4, Steven Sagaert wrote: It'd be nice to see a distributed array implemented on top of MPI (or similar high-performance distribution libs), like Fortran co-arrays, but since I'm out of academia and no longer have access to real supercomputers, I'm actually more interested in wrappers for cloud-based distributed computing frameworks like Spark (which do have distributed data structure abstractions, especially distributed dataframes).
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
Obviously a particular system might have a well-tuned library routine that's faster than our equivalent. But think about it: is having a slow interpreter, and relying on code to spend all its time in pre-baked library kernels, the *right* way to get performance? That's just the same boring design that has been used over and over again, in matlab, IDL, octave, R, etc. In those cases the language isn't bringing much to the table, except a pile of rules about how important code must still be written in C/Fortran, and how your code must be vectorized or shame on you.

That wasn't what I was saying. I like the philosophy behind julia. But in practice (as of now), even in julia you still have to code in a certain style if you want very good performance, and that's no different than in any other language. Ideally, of course, the compiler should be able to optimize the code so that different styles (e.g. functional/vectorized style vs. imperative/loop style) give the same performance and the programmer doesn't have to think about it; maybe one day it will be like that in julia, but we're not quite there yet AFAIK. Having said that, I like Julia and hopefully it will keep on getting better/faster. So good job and keep up the good work.

On Fri, May 1, 2015 at 11:48 AM, Tim Holy tim@gmail.com wrote: On Friday, May 01, 2015 08:03:31 AM Scott Jones wrote: Still, same issue as I described above... probably better to increase by 2x up to a point, and then by chunk sizes, where the chunk sizes might slowly get larger... I see your point, but it will also break the O(n log n) scaling. We couldn't hard-code the cutoff, because some people run julia on machines with 4GB of RAM and others with 1TB of RAM.
So, we could query the amount of RAM available and switch based on that result, but since all this would only make a difference for operations that consume between 0.5x and 1x the user's RAM (which to me seems like a very narrow window, on the log scale), is it really worth the trouble? --Tim
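The trade-off Scott and Tim are discussing can be sketched with a toy model (not Julia's actual resizing code): count how many elements get copied while appending n items one at a time under geometric (2x) growth versus fixed-size chunk growth. Geometric growth copies O(n) elements in total (amortized O(1) per append); fixed chunks copy O(n^2).

```julia
# Toy model: elements copied during reallocation while pushing n items,
# growing capacity by doubling.
function copies_geometric(n)
    cap, copied = 1, 0
    for len in 1:n
        if len > cap
            copied += cap   # reallocate: copy all existing elements over
            cap *= 2
        end
    end
    return copied
end

# Same model, but growing by a fixed chunk each time capacity runs out.
function copies_fixed_chunk(n; chunk = 64)
    cap, copied = chunk, 0
    for len in 1:n
        if len > cap
            copied += cap
            cap += chunk
        end
    end
    return copied
end
```

With n = 10_000, the doubling strategy copies fewer than 2n elements in total, while the fixed-chunk strategy copies hundreds of thousands; the chunk approach only pays off near the memory ceiling, which is the narrow window Tim describes.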
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
On Friday, May 1, 2015 at 7:23:40 PM UTC+2, Steven G. Johnson wrote: On Friday, May 1, 2015 at 1:12:00 PM UTC-4, Steven Sagaert wrote: That wasn't what I was saying. I like the philosophy behind julia. But in practice (as of now), even in julia you still have to code in a certain style if you want very good performance, and that's no different than in any other language. The goal of Julia is not to be a language in which it is *impossible* to write slow code, or a language in which all programming styles are equally fast. I didn't say that was a goal of Julia, but it sure would be nice to have :) though probably an impossible dream. The goal (or at least, one of the goals) is to be an expressive, high-level dynamic language in which it is also *possible* to write performance-critical inner-loop code. That *is* different from other high-level languages, in which it is typically *not* possible to write performance-critical inner-loop code without dropping down to a lower-level language (C, Fortran, Cython...). If you are coding exclusively in Python or R, and there isn't an optimized function appropriate for the innermost loops of your task at hand, you are out of luck. Like I said: I like Julia and I am rooting for it, but just to play devil's advocate: I believe it's also a goal (and possibility) of numba to write C-level efficient code in Python. All you have to do is add an annotation here and there.
[julia-users] Re: the state of GUI toolkits?
I think it depends how you want to build the GUI: if you want to do it old school, by calling a bunch of julia functions/methods that wrap the C++ methods, then yes, a lot of C++ classes/methods will need to be wrapped. However, if you stick to the new-school approach, i.e. QtQuick + QML, then a lot fewer C++ classes need to be wrapped, and GUI construction would basically be writing QML strings and passing them from julia to the Qt5 side, which then interprets them. So in that sense it's not that different from, say, writing an SQL library wrapper like ODBC.

On Thursday, April 30, 2015 at 10:59:32 PM UTC+2, Max Suster wrote: Good to hear interest. I will also have to look at what might be a good strategy for wrapping Qt5 with Cxx. The core functionality of Qt5 (shared by Qt4) would be an obvious place to start. The part that is clearly daunting is the interface for event handling, namely signals and slots. Not only do we have to deal with/replace the underlying XML support, but the syntax has also changed a lot between Qt4 and Qt5.
[julia-users] Re: the state of GUI toolkits?
The advantage of doing it the modern way is that you can then also use GUI design tools like Qt Quick Designer to do your GUI layout graphically, let it generate QML, and just copy-paste that into your julia GUI code.

On Friday, May 1, 2015 at 10:59:43 AM UTC+2, Steven Sagaert wrote: I think it depends how you want to build the GUI: if you want to do it old school by calling a bunch of julia functions/methods that wrap the C++ methods then yes a lot of C++ classes/methods will need to be wrapped. [...]
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
I think the performance comparisons between Julia and Python are flawed. They seem to be between standard Python and Julia, but since Julia is all about scientific programming it really should be between SciPy and Julia. Since SciPy uses much of the same underlying libs in Fortran/C, the performance gap will be much smaller, and to be really fair it should be between numba-compiled SciPy code and julia. I suspect the performance will be very close then (and close to C performance). Similarly, the standard benchmark (on the opening page of the julia website) between R and julia is also flawed, because it takes the best-case scenario for julia (loops and mutable data structures) and the worst-case scenario for R. When the same R program is rewritten in vectorised style it beats julia, see https://matloff.wordpress.com/2014/05/21/r-beats-python-r-beats-julia-anyone-else-wanna-challenge-r/. So my interest in julia isn't because it is the fastest scientific high-level language (because clearly at this stage you can't really claim that), but because it's a clean, interesting language (still needs work on some rough edges of course) with clean(er) and clear(er) libraries, and one that gives reasonable performance out of the box without much tweaking.

On Friday, May 1, 2015 at 12:10:58 AM UTC+2, Scott Jones wrote: Yes... Python will win on string processing... esp. with Python 3... I quickly ran into things that were 800x faster in Python... (I hope to help change that though!) Scott

On Thursday, April 30, 2015 at 6:01:45 PM UTC-4, Páll Haraldsson wrote: I wouldn't expect a difference in Julia for code like that (didn't check). But I guess what we are often seeing is someone comparing tuned Python code to newbie Julia code. I still want it faster than that code.. (assuming the same algorithm; note the row- vs. column-major caveat). My main point: *should* Python at any time win?

2015-04-30 21:36 GMT+00:00 Sisyphuss zhengw...@gmail.com: This post interests me. I'll write something here to follow this post.
The performance gap between normal code in Python and badly-written code in Julia is something I'd like to know too. As far as I know, the Python interpreter does some mysterious optimizations. For example, `(x**2)**2` is 100x faster than `x**4`.

On Thursday, April 30, 2015 at 9:58:35 PM UTC+2, Páll Haraldsson wrote: Hi, [As a best language is subjective, I'll put that aside for a moment.]

Part I. The goal, as I understand it, for Julia is to be at least within a factor of two of C - it is already matching that mostly - and long term to beat it (and C++). [What other goals are there? How about 0.4 now, or even 1.0..?] While that is the goal as a language, you can write slow code in any language, and Julia makes that easier. :) [If I recall, Bezanson mentioned it (the global problem) as a feature; any change there?] I've been following this forum for months and newbies hit the same issues. But almost always, without fail, the Julia code can be sped up (easily, as Tim Holy says). I'm thinking about the exceptions to that - are there any left? And about the first-run code slowness (see Part II). Just recently the last two flaws of Julia that I could see were fixed: decimal floating point is in (I'll look into the 100x slowness; that is probably to be expected of any language, though I think it may be a misunderstanding and/or I can do much better). And I understand the tuple slowness has been fixed (that was really the only core language defect). The former wasn't a performance problem (mostly a non-existence problem, and a correctness one (where needed)..). Still, we see threads like this recent one: https://groups.google.com/forum/#!topic/julia-users/-bx9xIfsHHw It seems changing the order of nested loops also helps. Obviously Julia can't beat assembly, but really C/Fortran is already close enough (within a small factor). The above row- vs. column-major issue (caching effects in general) can kill performance in all languages.
Putting that newbie mistake aside, is there any reason Julia can't be within a small factor of assembly (or C) in all cases already?

Part II. Except for caching issues, I still want the most newbie code, or even intentionally brain-damaged code, to run faster than at least Python/scripting/interpreted languages. Potential problems (that I think are solved, or at least not problems in theory): 1. I know Any kills performance. Still, isn't that the default in Python (and Ruby, Perl?)? Is there a good reason Julia can't be faster than at least all the so-called scripting languages in all cases (excluding small startup overhead, see below)? 2. The global issue - not sure if that slows other languages down, say Python. Even if it doesn't, should Julia be slower than Python because of globals? 3. Garbage collection. I do not see that as a problem, incorrect? Mostly performance variability ([3D] games - subject for another post, as I'm not sure
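Point 2 above (the "global issue") is the classic Julia performance tip: a non-constant global has no fixed type, so the compiler can't specialize code that reads it. A minimal sketch of the two styles:

```julia
gx = 1.0   # non-constant global: its type may change, so every use is type-checked at runtime

# Reads the global inside the loop: the compiler cannot infer the type of gx,
# so this loop runs on the slow, dynamically-dispatched path.
function sum_global(n)
    s = 0.0
    for _ in 1:n
        s += gx
    end
    return s
end

# Same computation with the value passed as an argument: x has a concrete
# inferred type, so the loop compiles to tight machine code.
function sum_local(x, n)
    s = 0.0
    for _ in 1:n
        s += x
    end
    return s
end
```

Declaring `const gx = 1.0`, or passing the value as an argument as in `sum_local`, restores full speed; both functions return the same value.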
Re: [julia-users] Re: Performance of Distributed Arrays
It'd be nice to see a distributed array implemented on top of MPI (or similar high-performance distribution libs), like Fortran co-arrays, but since I'm out of academia and no longer have access to real supercomputers, I'm actually more interested in wrappers for cloud-based distributed computing frameworks like Spark (which do have distributed data structure abstractions, especially distributed dataframes).

On Friday, May 1, 2015 at 4:20:13 AM UTC+2, Jake Bolewski wrote: Yes, performance will be largely the same on 0.4. If you have to write any performance-sensitive code at scale, MPI is really the only option I can recommend now. I don't know what you are trying to do, but the MPI.jl library is a bit incomplete, so it would be great if you used it and could contribute back in some way. All the basic operations should be covered. -Jake

On Thursday, April 30, 2015 at 12:29:15 PM UTC-4, Ángel de Vicente wrote: Hi Jake, Jake Bolewski jakebo...@gmail.com writes: DistributedArray performance is pretty bad. The reason for removing them from base was to spur their development. All I can say at this time is that we are actively working on making their performance better. OK, thanks. Should I try the DistributedArray package on 0.4-dev, or will the performance be similar for the moment? For every parallel program you have implicit serial overhead (this is especially true with multiprocessing). The fraction of serial work to parallel work determines your potential parallel speedup. The ratio of parallel work to serial overhead in this case is really bad, so I don't think your observation is really surprising. If this is on a shared-memory machine I would try using SharedArrays, as the serial communication overhead will be lower and the potential parallel speedup much higher. DistributedArrays only really make sense if they are in fact distributed over multiple machines. I will try SharedArrays, but the goal is to be able to run the code (not this one :-)) over distributed machines.
For the moment my only hope is MPI.jl then? Thanks, -- Ángel de Vicente http://www.iac.es/galeria/angelv/
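A hedged sketch of the SharedArray suggestion on a single multi-core machine, using current package names (`Distributed` and `SharedArrays` are separate stdlibs on modern Julia; in the 0.3/0.4 era this was `@parallel` in Base): worker processes fill disjoint chunks of one shared-memory array without serializing the data.

```julia
using Distributed, SharedArrays

addprocs(2)                      # spawn two local worker processes

a = SharedArray{Float64}(100)    # one array backed by shared memory
@sync @distributed for i in 1:length(a)
    a[i] = i^2                   # each worker writes its own index range
end

total = sum(a)                   # read the result back on the master process
```

Because the memory is shared, no copies cross process boundaries; this is why the serial overhead is much lower than with a DistributedArray on one machine.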
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
On Friday, May 1, 2015 at 12:26:54 PM UTC+2, Scott Jones wrote: On Friday, May 1, 2015 at 4:25:50 AM UTC-4, Steven Sagaert wrote: I think the performance comparisons between Julia and Python are flawed. They seem to be between standard Python and Julia, but since Julia is all about scientific programming it really should be between SciPy and Julia. Since SciPy uses much of the same underlying libs in Fortran/C, the performance gap will be much smaller, and to be really fair it should be between numba-compiled SciPy code and julia. I suspect the performance will be very close then (and close to C performance). Why should Julia be limited to scientific programming? I think it can be a great language for general programming. I agree, but for now and the short-term future I think the core domain of julia is scientific computing/data science, and so to have fair comparisons one should not just compare julia to vanilla Python but especially to SciPy and numba. For the most part, I think it already is (it could use some changes for string handling [I'd like to work on that ;-)], decimal floating-point support [that is currently being addressed, kudos to Steven G. Johnson], maybe some better language constructs to allow better software engineering practices [that is being hotly debated!], and definitely a real debugger [I think Keno is working on that]). Comparing Julia to Python for general computing is totally valid and interesting. Comparing Julia to SciPy for scientific computing is also totally valid and interesting. Similarly, the standard benchmark (on the opening page of the julia website) between R and julia is also flawed, because it takes the best-case scenario for julia (loops and mutable data structures) and the worst-case scenario for R. When the same R program is rewritten in vectorised style it beats julia, see https://matloff.wordpress.com/2014/05/21/r-beats-python-r-beats-julia-anyone-else-wanna-challenge-r/ .
So my interest in julia isn't because it is the fastest scientific high-level language (because clearly at this stage you can't really claim that), but because it's a clean, interesting language (still needs work on some rough edges of course) with clean(er) and clear(er) libraries, and one that gives reasonable performance out of the box without much tweaking.
[julia-users] Re: the state of GUI toolkits?
It's been a long time since I looked at Gtk in any detail, so I kind of forgot about Glade, but after taking a quick look it seems the Glade XML is only about the structure, not the behaviour. QML also allows describing behaviour, so it is more versatile AFAIK.

On Friday, May 1, 2015 at 12:31:16 PM UTC+2, Andreas Lobinger wrote: just for the record: On Friday, May 1, 2015 at 11:07:53 AM UTC+2, Steven Sagaert wrote: The advantage of doing it the modern way is that you can then also use GUI design tools like Qt Quick Designer to do your GUI layout graphically, let it generate QML, and just copy-paste that into your julia GUI code. GtkBuilder + Glade have existed: https://developer.gnome.org/gtk3/stable/GtkBuilder.html https://glade.gnome.org/ for some time...
[julia-users] Re: the state of GUI toolkits?
I'd love to see a Qt5/QML wrapper. I find Qt5 superior to Gtk. Also, it's available on more platforms (mobile).

On Tuesday, April 28, 2015 at 9:46:52 AM UTC+2, Andreas Lobinger wrote: Hello colleagues, what is the status of availability and use cases for GUI toolkits? I see Tk and Gtk on pkg.julialang.org. Gtk has the tag 'doesn't load' from testing; Tk seems OK. In a recent discussion here, Tim Holy mentioned himself testing Qwt, and Qt in general seems to be a testcase for Cxx. Do I miss something here? Wishing a happy day, Andreas
[julia-users] Re: the state of GUI toolkits?
There is a BIG difference between Qt4 and Qt5. Qt4 is old school: the whole GUI has to be programmed in C++. Qt5 is new school: you can declaratively specify the GUI behavior via QML (basically javascript) and the QtQuick classes. See https://wiki.qt.io/Qt_5. The old-school way of doing everything in C++ is still available though. Note that e.g. Ubuntu, which was hardcore Gtk, has also switched to Qt5 for their new versions because of the greater power/flexibility/platform range of Qt5 vs. Gtk.

On Wednesday, April 29, 2015 at 3:30:09 PM UTC+2, Tom Breloff wrote: I'm curious... what are the advantages of Qt5 over Qt4? Is there functionality missing from Qt4?

On Wednesday, April 29, 2015 at 3:52:40 AM UTC-4, Steven Sagaert wrote: I'd love to see a Qt5/QML wrapper. I find Qt5 superior to Gtk. Also it's available on more platforms (mobile).

On Tuesday, April 28, 2015 at 9:46:52 AM UTC+2, Andreas Lobinger wrote: Hello colleagues, what is the status of availability and use cases for GUI toolkits? [...]
[julia-users] Re: Julia and Spark
yes, that's a solid approach. For my personal julia-java integrations I also run the JVM in a separate process. On Wednesday, April 15, 2015 at 9:30:28 PM UTC+2, wil...@gmail.com wrote: 1) simply wrap the Spark java API via JavaCall. This is the low level approach. BTW I've experimented with JavaCall and found it was unstable and also lacking functionality (e.g. there's no way to shut down the JVM or create a pool of JVMs analogous to DB connections), so that might need some work before trying the Spark integration. Using JavaCall is not an option, especially when the JVM became closed-source, see https://github.com/aviks/JavaCall.jl/issues/7. Python bindings are done through Py4J, which is RPC to the JVM. If you look at sparkR https://github.com/apache/spark/tree/master/R, it is done in the same way. sparkR uses an RPC interface to communicate with a Netty-based Spark JVM backend that translates R calls into JVM calls, keeps the SparkContext on the JVM side, and ships serialized data to/from R. So it is just a matter of writing a Julia RPC to the JVM and wrapping the necessary Spark methods in a Julia-friendly way.
[julia-users] Re: Julia Installation Conflict with R
Besides using the R distrib from revolutionanalytics.com (which is based on Intel MKL), you could also completely isolate R, julia and their dependencies by running them in separate Docker containers. On Tuesday, April 14, 2015 at 1:37:52 AM UTC+2, Yudong Ma wrote: Hi. I am pretty new to Julia, and I did manage to install Julia on Ubuntu precise 64. Everything works except that the installation of Julia updates some libraries, and these updates make the R shared lib /usr/lib/libR.so complain that the symbol _pcre_valid_utf8 is undefined. The libraries updated by Julia that affect R are libpcre3, libpcrecpp0, libpcre3-dev. I am wondering whether any Julia users have encountered this issue, and how should I resolve it? Best Regards
[julia-users] Re: Julia Installation Conflict with R
Docker is actually pretty lightweight (compared to VMs). I'm personally looking forward to the day that Docker-based desktop OSes become available (something like Snappy Ubuntu Core, but not only for cloud/embedded but also for desktop). That solves dependency hell once and for all. On Thursday, April 16, 2015 at 2:51:35 PM UTC+2, Tony Kelman wrote: Docker's extremely useful for some things, but also overkill here. It sounds like Yudong (hi BTW, didn't know you used R at all) found a perfectly good solution to the problem in the duplicated thread (probably wasn't showing up immediately if the thread needed moderator approval) at https://groups.google.com/forum/#!topic/julia-users/rK6gSIM822w - upgrade R to a version that is compatible with the same pcre version as Julia. Wasn't really a problem with Julia so much as a problem with R not having upper-bound constraints on its pcre version requirement. -Tony On Thursday, April 16, 2015 at 12:40:50 AM UTC-7, Steven Sagaert wrote: Besides using the R distrib from revolutionanalytics.com (which is based on Intel MKL), you could also completely isolate R, julia and their dependencies by running them in separate Docker containers. On Tuesday, April 14, 2015 at 1:37:52 AM UTC+2, Yudong Ma wrote: Hi. I am pretty new to Julia, and I did manage to install Julia on Ubuntu precise 64. Everything works except that the installation of Julia updates some libraries, and these updates make the R shared lib /usr/lib/libR.so complain that the symbol _pcre_valid_utf8 is undefined. The libraries updated by Julia that affect R are libpcre3, libpcrecpp0, libpcre3-dev. I am wondering whether any Julia users have encountered this issue, and how should I resolve it? Best Regards
[julia-users] Re: Julia and Spark
I've been contemplating writing a high level wrapper for Spark myself, since I'm interested in both Julia and Spark, but I was waiting for Julia 0.4 to finalize before even starting. One can do the integration on several levels: 1) simply wrap the Spark java API via JavaCall. This is the low level approach. BTW I've experimented with JavaCall and found it was unstable and also lacking functionality (e.g. there's no way to shut down the JVM or create a pool of JVMs analogous to DB connections), so that might need some work before trying the Spark integration. 2) Spark 1.3 now has new, higher-level interfaces: a dataframe API for accessing data in the form of distributed dataframes, and a pipeline API to compose algorithms via a pipeline framework. By wrapping the Spark dataframe with a julia dataframe you would quickly have a high level (data scientist level) interface to Spark. BTW Spark dataframes are actually also FASTER than the more low level approaches like java/scala method calls or Spark SQL (intermediate level) because Spark itself can do more optimizations (this is similar to how PyData Blaze works). By wrapping the pipeline API one could quickly compose Spark algos to create new algos. 3) for an intermediate approach: wrap the Spark SQL API and use SQL to query the system. Personally I would start with the dataframe and pipeline APIs. Maybe later on, if needed, add the Spark SQL API, and only do the low level stuff last if needed. But before interfacing Spark dataframes with julia ones, the julia dataframe should become more powerful: at least && and || should be allowed in indexing for richer querying like in R dataframes. On Wednesday, April 15, 2015 at 11:37:50 AM UTC+2, Tanmay K. Mohapatra wrote: This thread is to discuss Julia - Spark integration further. This is a continuation of discussions from https://groups.google.com/forum/#!topic/julia-users/LeCnTmOvUbw (the thread topic was misleading and we could not change it). 
To summarize briefly, here are a few interesting packages: - https://github.com/d9w/Spark.jl - https://github.com/jey/Spock.jl - https://github.com/benhamner/MachineLearning.jl - packages at https://github.com/JuliaParallel We can discuss approaches and coordinate efforts towards whichever looks promising.
[julia-users] Re: Julia Installation Conflict with R
I have both julia 0.3.7 and R v3.1.2 on the same Ubuntu 14.04. I first installed julia and later on R. The R distrib is however the one from Revolution R Open 8.0.2 beta, not the standard one. Seems to work fine. On Tuesday, April 14, 2015 at 1:37:52 AM UTC+2, Yudong Ma wrote: Hi. I am pretty new to Julia, and I did manage to install Julia on Ubuntu precise 64. Everything works except that the installation of Julia updates some libraries, and these updates make the R shared lib /usr/lib/libR.so complain that the symbol _pcre_valid_utf8 is undefined. The libraries updated by Julia that affect R are libpcre3, libpcrecpp0, libpcre3-dev. I am wondering whether any Julia users have encountered this issue, and how should I resolve it? Best Regards
[julia-users] does pca() center the data?
does pca() center the input/output data or do you have to do that yourself?
Re: [julia-users] does pca() center the data?
the one from the standard lib On Monday, April 6, 2015 at 4:01:00 PM UTC+2, Andreas Noack wrote: Which pca? 2015-04-06 6:53 GMT-07:00 Steven Sagaert steven@gmail.com: does pca() center the input/output data or do you have to do that yourself?
Re: [julia-users] does pca() center the data?
thanks! On Monday, April 6, 2015 at 6:43:54 PM UTC+2, Stefan Karpinski wrote: Looks like yes: https://github.com/JuliaStats/MultivariateStats.jl/blob/master/src/pca.jl On Mon, Apr 6, 2015 at 12:27 PM, Steven Sagaert steven@gmail.com wrote: I meant the one in the MultivariateStats package On Monday, April 6, 2015 at 6:19:51 PM UTC+2, Andreas Noack wrote: There is no pca in Julia Base 2015-04-06 9:16 GMT-07:00 Steven Sagaert steven@gmail.com: the one from the standard lib On Monday, April 6, 2015 at 4:01:00 PM UTC+2, Andreas Noack wrote: Which pca? 2015-04-06 6:53 GMT-07:00 Steven Sagaert steven@gmail.com: does pca() center the input/output data or do you have to do that yourself?
Re: [julia-users] does pca() center the data?
I meant the one in the MultivariateStats package On Monday, April 6, 2015 at 6:19:51 PM UTC+2, Andreas Noack wrote: There is no pca in Julia Base 2015-04-06 9:16 GMT-07:00 Steven Sagaert steven@gmail.com: the one from the standard lib On Monday, April 6, 2015 at 4:01:00 PM UTC+2, Andreas Noack wrote: Which pca? 2015-04-06 6:53 GMT-07:00 Steven Sagaert steven@gmail.com: does pca() center the input/output data or do you have to do that yourself?
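For anyone wondering what "centering" buys you: below is a small stdlib-only sketch of PCA by hand (this is not the MultivariateStats implementation, just an illustration of why the data must be centered before taking the SVD).

```julia
# PCA by hand: center the data, then SVD. Stdlib only.
using LinearAlgebra, Statistics

X = randn(100, 3) .+ 5.0          # 100 observations, 3 variables, mean far from 0
Xc = X .- mean(X, dims=1)         # centering: subtract each column's mean
U, S, V = svd(Xc)                 # PCA is the SVD of the *centered* data
scores = Xc * V                   # principal component scores

# The scores then have (numerically) zero mean -- without centering, the first
# component would mostly just point at the data's offset from the origin.
all(abs.(mean(scores, dims=1)) .< 1e-8)
```

In MultivariateStats the fitted model handles this subtraction for you, which is what the linked pca.jl source confirms.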
Re: [julia-users] Why is Gadfly so slow when plotting its first plot?
Thanks for the info. I suspected it might have something to do with JIT compilation. On Saturday, March 28, 2015 at 2:16:54 PM UTC+1, Isaiah wrote: This delay is due to parsing and JIT'ing a bunch of code in both Gadfly and dependencies. There is a work-in-progress caching process you could try. See: https://github.com/dcjones/Gadfly.jl/issues/251#issuecomment-38626716 and: http://docs.julialang.org/en/latest/devdocs/sysimg/ The PR to make that simpler and module-specific is here: https://github.com/JuliaLang/julia/pull/8745 On Sat, Mar 28, 2015 at 9:00 AM, Steven Sagaert steven@gmail.com wrote: Hi, I use Gadfly to create simple barplots and save them as SVG. Since this is for usage in a web page, I've only installed Gadfly, no extra backends. Now when doing the first plot it is incredibly slow, but much better on subsequent plots. Why is that? Is there anything that can be done to speed things up?
[julia-users] Why is Gadfly so slow when plotting its first plot?
Hi, I use Gadfly to create simple barplots and save them as SVG. Since this is for usage in a web page, I've only installed Gadfly, no extra backends. Now when doing the first plot it is incredibly slow, but much better on subsequent plots. Why is that? Is there anything that can be done to speed things up?
[julia-users] Re: Some simple use cases for multi-threading
also the fork-join threadpool in Java. On Wednesday, March 18, 2015 at 7:12:13 PM UTC+1, Sebastian Good wrote: Task stealing parallelism is an increasingly common use case and easy to program. e.g. Cilk, Grand Central Dispatch, On Thursday, March 12, 2015 at 11:52:37 PM UTC-4, Viral Shah wrote: I am looking to put together a set of use cases for our multi-threading capabilities - mainly to push forward as well as a showcase. I am thinking of starting with stuff in the microbenchmarks and the shootout implementations that are already in test/perf. I am looking for other ideas that would be of interest. If there is real interest, we can collect all of these in a repo in JuliaParallel. -viral
[julia-users] Re: Some simple use cases for multi-threading
How about a multithreaded (+ coroutine as it is now) HttpServer? On Friday, March 13, 2015 at 4:52:37 AM UTC+1, Viral Shah wrote: I am looking to put together a set of use cases for our multi-threading capabilities - mainly to push forward as well as a showcase. I am thinking of starting with stuff in the microbenchmarks and the shootout implementations that are already in test/perf. I am looking for other ideas that would be of interest. If there is real interest, we can collect all of these in a repo in JuliaParallel. -viral
[julia-users] Re: 3D interactive plots in IJulia
Hi Simon, The screenshots look nice but I have to ask: why build a high performance native 2D/3D scientific plotting lib based on OpenGL from scratch when you could wrap mature native libs like VTK or Mayavi? I mean, in R you have RGL, which is also directly based on OpenGL. You can use it for some basic 3D interactive scatterplots and surface plots, but personally I don't find these plots very satisfactory. I mean, raw OpenGL itself is mainly for 3D graphics, not scientific visualization, and hence a whole lot more work is needed to add this functionality on top of it. On Tuesday, March 3, 2015 at 10:43:27 PM UTC+1, Simon Danisch wrote: Well there is GLPlot.jl https://github.com/SimonDanisch/GLPlot.jl, but it might not be what you're looking for... Also it's on a feature freeze right now. I'm restructuring the architecture before I add more advanced features. On Tuesday, March 3, 2015 at 16:38:32 UTC+1, Andrei Berceanu wrote: Is there some Julia library that allows one to create 3D (surface) plots in the IJulia notebook and then rotate them interactively, using the mouse? //A
[julia-users] Re: Private functions in the modules
It's not a bug, it's a feature ;) I found this odd too when I was new to julia and complained about it. I wanted strict private visibility like in C++/Java/C#, but the julia team does not want this. The only thing export does is let you call the function without the module prefix. On Friday, February 27, 2015 at 9:11:35 AM UTC+1, Devendra Ghate wrote: Hello, Consider the following example from the julia manual page on `modules`.
~~~
module MyModule
export x, y

x() = x
y() = y
p() = p
end
~~~
Loading the module by any of the following methods makes the function `MyModule.p()` available in the main workspace: 1. `using MyModule` 2. `using MyModule.x` 3. `import MyModule` 4. `import MyModule.x` I would expect `p` (as a private function) to not be available for execution outside the module. The manual page also mentions that this should be the case. I am using Julia 0.3.3. Maybe I need to upgrade. Cheers, Devendra.
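The semantics described above are easy to verify; a minimal sketch (module and function names invented for illustration, and note the `using .Demo` dot is current syntax, while 0.3-era code wrote `using Demo`):

```julia
# export controls *unqualified* access only; it is not an access modifier.
module Demo
export f
f() = 1
p() = 2        # not exported, but not private either
end

using .Demo    # on Julia 0.3 this was `using Demo`

f()            # works: f is exported, so no module prefix is needed
Demo.p()       # also works: any function is reachable via the module prefix
# p()          # this, however, throws UndefVarError: p was never brought into scope
```

So `p` is hidden from the namespace but not from callers, which matches the "it's a feature" answer above.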
[julia-users] Re: movingpastasquishedcaseconvention?
I prefer Java's camelcase: searchSortedLast: it's the same length as all lower case but clearer. On Thursday, February 5, 2015 at 8:12:43 PM UTC+1, David James wrote: Hello, The title of this post is Moving Past a Squished Case Convention not Moving Pastas Quiche :) The Julia standard library tends to use the squishedcase notation. Being concise is great for mathematical functions, like sin, cos, and ln. However, it is cognitively harder for people for compound function names; e.g. searchsortedlast. Such a naming convention flies in the face of real programming experience. It makes programming harder for people. There are many sane ways to name functions. Lisps tend to use hyphens, others often use underscores. R libraries use a non-standard mix [1]. Interestingly, the Julia parser code itself uses hyphens; e.g. prec-assignment and prec-conditional: https://github.com/JuliaLang/julia/blob/master/src/julia-parser.scm It would be a shame for squishedcase to persist as the language reaches 1.0. What are some possible ways to address this problem without breaking compatibility in the short-run? I see a possible solution. Choose a character and encourage its use to break apart words; e.g. -, _, or a middot (·) [2]. Make it highly recommended but non-breaking until 1.0. Deprecate functionsusingsquishedcase. Julia is great overall but lacking in this way. Let's make it better. Sincerely, David [1] http://stackoverflow.com/questions/1944910/what-is-your-preferred-style-for-naming-variables-in-r [2] The middot is relatively unobtrusive and doesn't take up much space horizontally, e.g. search·sorted·last. It is also useful for variables representing compound units; e.g. N·m.
[julia-users] LAPACKException(1) during SVD
When doing an SVD of a large matrix I get:

ERROR: LAPACKException(1)
 in gesdd! at linalg/lapack.jl:1046
 in svdfact! at linalg/factorization.jl:660
 in svdfact at linalg/factorization.jl:664

It's definitely something related to the data, because it works on different matrices and the code has been working for months. Any idea what that error is about? I'm working on Ubuntu 14.04, julia version 0.3.5
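For reference, a common trigger for `LAPACKException(1)` from `gesdd!` is non-finite entries in the input; the other is genuine non-convergence of the divide-and-conquer driver. A hedged sketch of the usual checks, written against current Julia's LinearAlgebra (the 0.3-era entry point was `svdfact`, not `svd`):

```julia
using LinearAlgebra

A = randn(200, 50)   # stand-in for the problematic matrix

# 1) gesdd chokes on NaN/Inf, so rule that out first:
@assert all(isfinite, A) "matrix contains NaN or Inf"

# 2) The default driver is gesdd (divide and conquer). When it fails to
#    converge on valid data, the slower QR-iteration driver (gesvd) is
#    often more robust:
F = svd(A; alg=LinearAlgebra.QRIteration())
length(F.S)   # min(size(A)...) singular values
```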
Re: [julia-users] Re: Almost at 500 packages!
Since R basically has the same multiple dispatch, I don't think this is what explains the difference with CRAN. I think the difference is that the julia repository is based on GitHub, which enables collaboration, whereas CRAN is basically a file server without any collaboration tools. On Thursday, January 22, 2015 at 9:17:53 PM UTC+1, Stefan Karpinski wrote: I think that multiple dispatch helps here a lot – or really just the fact that methods don't live *inside* of types like they do in class-based single-dispatch o.o. languages. This lets one shared package define a bunch of straightforward types and other packages can use those types without having to agree exactly on what functions those shared types need to support. Distributions is a great example – the definitions of distribution types are pretty uncontroversial and those same types can be used for anything from asking for PDF/CDF values, sampling, or for probabilistic programming. I'm hopeful that this can help Julia to have a more cleanly factored package ecosystem than other languages have been able to. On Thu, Jan 22, 2015 at 2:00 PM, Jacob Quinn quinn@gmail.com wrote: Great points Ista. I think the main motivation in emulating CRAN is with respect to overall volume of functionality, but your points about too many overlapping or half-baked packages strike home as a former (and still occasional) R user. I think the efforts in the Julia package repository, METADATA, to encourage collaboration and merging packages are the most important and useful in managing the package ecosystem. As a personal anecdote, I initially created the SQLite.jl package closely following the sqldf R package. After an initial sprint, I had other priorities take over and the package languished a bit. Once I found more time (and need!), I started a redesign of the package from the ground up to be more Julian. 
Coincidentally, at the same time, I came across a reddit thread where Sean Marshallsay had published a post about redesigning the SQLite package to be more Julian as well. I pinged him on reddit to see if he'd be willing to combine efforts on redesigning the package with me and he's contributed some of the most useful functionality now to date (being able to register and run custom julia scalar and aggregate functions in SQLite). I think in another world (or another programming language), I would have seen his post and thought, Oh well, someone else is creating a competing package. But having grown with the Julia community for a couple years now, I thought I'd reach out about collaborating, and it's certainly worked out quite well for everyone, IMO. Anyway, TL;DR the collaborative nature of the package community is one of my all-time favorite aspects of Julia. -Jacob On Thu, Jan 22, 2015 at 11:49 AM, Ista Zahn ista...@gmail.com wrote: As an R user I'm surprised to see CRAN held up as a model to aspire to. There is a _lot_ of overlapping functionality among those 6k packages, making it hard to figure out which one is best for a particular purpose. There are also a lot of unfocused packages providing miscellaneous collections of functions, which makes it difficult to understand exactly what the package offers you as a user. As a user things are easier if a) each package has a clearly defined scope (i.e., does one thing well), and b) there are not too many similar packages to choose from for any particular task. None of this is to say that julia isn't on the right track in terms of packages, just that I question the wisdom of emulating CRAN in this regard. Best, Ista On Wed, Jan 21, 2015 at 7:46 PM, Iain Dunning iaind...@gmail.com wrote: Yes indeed Christoph, a package that doesn't work is a package that might as well not exist. 
Fortunately, and fairly uniquely I think, we can quantify to some extent how many of our packages are working, and the degree to which they are. In my mind the goal now is grow fast and don't break too many things, and I think our pace over the last month or so of around 1 package per day is fantastic, with good stability of packages (i.e. they pass tests). I've also noticed that packages being registered now are often of a higher quality than they used to be, in terms of tests and documentation. I talked about this a bit at JuliaCon, but in some sense NPM and CRAN represent different ends of a spectrum of possibilities, and it seems like the consensus is more towards CRAN. So, we're doing good I think. On Wed, Jan 21, 2015 at 7:02 PM, Kevin Squire kevin@gmail.com wrote: Additional references: PyPI lists 54212 packages, currently (roughly half as many as node) but CRAN only has 6214. Cheers, Kevin On Wed, Jan 21, 2015 at 3:37 PM, Sean Garborg sean.g...@gmail.com wrote: You wouldn't like node ;) On
[julia-users] Re: Almost at 500 packages!
A growing ecosystem is great, but let's not fall into the trap of bigger is better. CPAN (and CRAN, which is modeled after it) is/was huge, but that hasn't prevented the long decline of Perl. Sometimes less is more, meaning: I'd rather have a smaller number of high quality, larger packages/frameworks with little overlap between them than a zoo like CRAN. So far this seems to be working out quite well for julia, but that may be because it's still very early. Once the user base grows, will the packages stay so focused or will we see more duplication/overlap? Let's hope for the former. On Tuesday, January 20, 2015 at 4:32:45 PM UTC+1, Iain Dunning wrote: Just noticed on http://pkg.julialang.org/pulse.html that we are at 499 registered packages with at least one version tagged that are Julia 0.4-dev compatible (493 on Julia 0.3). Thanks to all the package developers for their efforts in growing the Julia package ecosystem!
Re: [julia-users] Re: Almost at 500 packages!
I couldn't agree more. Personally I find CRAN to be a mess. There's no organization to it. You can only find something in there by googling. Also the documentation of R packages is very spartan... On Thursday, January 22, 2015 at 7:49:40 PM UTC+1, Ista Zahn wrote: As an R user I'm surprised to see CRAN held up as a model to aspire to. There is a _lot_ of overlapping functionality among those 6k packages, making it hard to figure out which one is best for a particular purpose. There are also a lot of unfocused packages providing miscellaneous collections of functions, which makes it difficult to understand exactly what the package offers you as a user. As a user things are easier if a) each package has a clearly defined scope (i.e., does one thing well), and b) there are not too many similar packages to choose from for any particular task. None of this is to say that julia isn't on the right track in terms of packages, just that I question the wisdom of emulating CRAN in this regard. Best, Ista On Wed, Jan 21, 2015 at 7:46 PM, Iain Dunning iaind...@gmail.com wrote: Yes indeed Christoph, a package that doesn't work is a package that might as well not exist. Fortunately, and fairly uniquely I think, we can quantify to some extent how many of our packages are working, and the degree to which they are. In my mind the goal now is grow fast and don't break too many things, and I think our pace over the last month or so of around 1 package per day is fantastic, with good stability of packages (i.e. they pass tests). I've also noticed that packages being registered now are often of a higher quality than they used to be, in terms of tests and documentation. I talked about this a bit at JuliaCon, but in some sense NPM and CRAN represent different ends of a spectrum of possibilities, and it seems like the consensus is more towards CRAN. So, we're doing good I think. 
On Wed, Jan 21, 2015 at 7:02 PM, Kevin Squire kevin@gmail.com wrote: Additional references: PyPI lists 54212 packages, currently (roughly half as many as node) but CRAN only has 6214. Cheers, Kevin On Wed, Jan 21, 2015 at 3:37 PM, Sean Garborg sean.g...@gmail.com wrote: You wouldn't like node ;) On Wednesday, January 21, 2015 at 4:29:53 PM UTC-7, Christoph Ortner wrote: Great that so many are contributing to Julia, but I would question whether such a large number of packages will be healthy in the long run. It will make it very difficult for new users to use Julia effectively. -- Iain Dunning PhD Candidate / MIT Operations Research Center http://iaindunning.com / http://juliaopt.org
Re: [julia-users] Re: Control system library for Julia?
GC will always be non-deterministic. For hard real time you just need to manage memory yourself. That's the approach used by real-time Java: http://www.rtsj.org/ On Monday, September 15, 2014 10:25:07 AM UTC+2, Uwe Fechner wrote: Hi, I am working on airborne wind energy as well. I wrote a kite-power system simulator in Python, where one of the controllers (the winch controller) is also implemented in Python. ( https://bitbucket.org/ufechner/freekitesim ) With Python you can reach a jitter of less than 2 ms in the 20Hz control loop quite easily (on low-latency Linux). In my case this is sufficient for prototyping, but the real flight control system should run at a higher update frequency (perhaps 200 Hz). In contrast to Julia, Python uses reference counting, and in my Python applications I just turn off the garbage collection. For Julia (and I would love to rewrite the simulator in Julia) this is probably not an option. A better garbage collector (which is in the pipeline, see: https://github.com/JuliaLang/julia/pull/5227 ) would definitely help. Generating embedded controllers in LLVM IR would be great! Best regards: Uwe On Monday, September 15, 2014 8:32:02 AM UTC+2, Andrew Wagner wrote: Hi Spencer! My job in airborne wind energy is ending soon so I don't have a specific application (aside from control), but I would want to stay sub-ms for anything in-process. I have been using Orocos extensively for the last few years. It's the best control middleware in the open source world, but I think a lot of things could be improved if it was re-implemented in a language with a better type system and introspection... one example would be that adding a new type to the system requires quite a bit of boilerplate code, creating an incentive for users to just pass data in flat arrays, subverting type safety. Cheers, Andrew On Mon, Sep 15, 2014 at 7:03 AM, Spencer Russell spencer@gmail.com wrote: Hi Andrew, What are your realtime deadlines? 
I'm working on live audio processing stuff with Julia, where I'd like to get the audio latency down into a few ms. Julia definitely isn't there yet (and might never get true hard-realtime), but there's some promising work being done on the GC to reduce pause time for lower-latency applications. It's also helpful to profile the code to reduce allocations (and the need for GC) down to a minimum. I haven't yet gotten down to zero-allocation code in my render loop, but once I got it down below 100 bytes I moved on to other more pressing features. At some point I'll dig deeper to see if I can get rid of the last few allocations. I'd definitely be happy if there are some more folks out there driving demand for lower-latency Julia. :) peace, s On Sun, Sep 14, 2014 at 3:58 PM, Andrew Wagner drew...@gmail.com wrote: Hello again Uwe! It's fun running into someone I know on a language geek forum :) I'm helping one of our bachelor's students implement an LQR controller on our carousel in Freiburg. It's an ugly hack, but I'm calling an octave script to recompute the feedback gains online. Octave wraps slicot, so perhaps wrapping slicot is the way to go for some functions, if the licenses are compatible. Personally, I have a burning desire for a better language we can actually do control in (rust?). I doubt Julia qualifies due to the garbage collection, but does anyone know if Julia has some sort of way to JIT Julia expressions to code that does not have any garbage collection? If so, is there a way to export them as object files and link against them from C? Then you'd still have to write glue code in a systems language, but at least the implementation of the controller wouldn't have to cross a language boundary... Cheers, Andrew On Thursday, February 20, 2014 10:56:20 PM UTC+1, Uwe Fechner wrote: Hello, I could not find any control system library for Julia yet. Would that make sense? 
There is a control system library available for Python: http://www.cds.caltech.edu/~murray/wiki/index.php/Python-control Perhaps this could be used as starting point? I think that implementing this in Julia should be easier and faster than in Python. Any comments? Should I open a feature request? Uwe Fechner, TU Delft, The Netherlands
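On the allocation-hunting described above (getting a render or control loop down to zero allocations): a small sketch of how that is usually checked, using `@allocated` from Base. The function and buffer names here are invented for illustration.

```julia
# A render loop that writes into a preallocated buffer: no temporaries,
# nothing for the GC to collect per iteration.
function render!(out::Vector{Float64}, gain::Float64)
    @inbounds for i in eachindex(out)
        out[i] = gain * sin(0.01 * i)
    end
    return nothing
end

const buf = zeros(1024)
render!(buf, 0.5)                 # warm up first so compilation isn't counted
@allocated render!(buf, 0.5)      # should report 0 bytes
```

Measuring after a warm-up call matters: the first call includes JIT compilation, which allocates heavily and would swamp the number you care about.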
Re: [julia-users] Are dataframes indexed?
Hi John, I didn't know this, but I wasn't knocking Hive (which is just fine). The problem as far as ML is concerned isn't Hive but Hadoop, which isn't very well suited for it. Sincerely, Steven On Monday, September 8, 2014 1:52:15 AM UTC+2, John Myles White wrote: I kind of suspect my team (which is the team that invented Hive) isn't likely to stop using Hive anytime soon. -- John On Sep 7, 2014, at 4:50 PM, Steven Sagaert steven@gmail.com wrote: On Monday, September 8, 2014 1:37:50 AM UTC+2, John Myles White wrote: Well, you can write an interface to sqlite to generate in-memory DB's. The only restriction is that you won't get some of the semantics you might want relative to DataFrames, which allow entries of all types. Personally, I'm much less interested in Spark and SciDB and much more interested in Hive. Well Spark has a Hive clone called Shark which can run HiveQL queries. It's just a lot faster ;) But they are moving more towards a general SQL system called Spark SQL and will reimplement Shark also on top of that in the future. Blaze's approach is very interesting. I agree. -- John On Sep 7, 2014, at 4:32 PM, Steven Sagaert steven@gmail.com wrote: On Sunday, September 7, 2014 7:28:18 PM UTC+2, Harlan Harris wrote: This was a feature that sorta existed for a while (see https://github.com/JuliaStats/DataFrames.jl/issues/24 ), but nobody was very happy with it, and I think John ripped it out as part of one of his simplification passes. It's tricky to think about how best to implement this sort of feature when you aspirationally want to support memory-mapped and distributed structures too, I was more thinking along the lines of a simple in-memory db. If you want out-of-memory distributed it's probably best to interface systems like Spark SQL or Scidb rather than develop that yourselves from scratch. Maybe write something in the spirit of Blaze (blaze.pydata.org)? 
Right now Blaze supports Spark but I was just discussing with them about scidb and they are also looking into that. and where you want a semantics that's explicitly set-like, cf Pandas or R's data.tables. R's data.table is nice but unfortunately only supports just one index. Also worth thinking about this in the context of John's just-announced goals: https://gist.github.com/johnmyleswhite/ad5305ecaa9de01e317e On Sun, Sep 7, 2014 at 12:54 PM, John Myles White johnmyl...@gmail.com wrote: No, DataFrames are not indexed. For now, you’d need to build a wrapper that indexes a DataFrame to get that kind of functionality. — John On Sep 7, 2014, at 9:53 AM, Steven Sagaert steven@gmail.com wrote: Hi, I was wondering if searching in a dataframe is indexed (in the DB sense, not array sense. e.g. a tree index structure) or not? If so can you have multiple indices (on multiple columns) or not?
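For the simple in-memory case, the wrapper John mentions can be as small as a Dict mapping column values to row numbers; a dependency-free sketch (the column data here is invented for illustration):

```julia
# Build a hash index over one column: value -> list of matching row numbers.
ids = [10, 20, 30, 20]                  # stand-in for a DataFrame column
index = Dict{Int,Vector{Int}}()
for (row, v) in enumerate(ids)
    push!(get!(index, v, Int[]), row)   # amortized O(1) per row
end

index[20]    # [2, 4]: the matching rows, with no scan over the column
```

One such Dict per indexed column gives you the "multiple indices" asked about, at the cost of keeping the indexes in sync on every insert or delete, which is exactly the bookkeeping a real DB does for you.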
[julia-users] Are dataframes indexed?
Hi, I was wondering if searching in a dataframe is indexed (in the DB sense, not the array sense, e.g. a tree index structure) or not? If so, can you have multiple indices (on multiple columns) or not?
[julia-users] when you do push! on an array does it get copied each time or in chunks?
When you start with an empty array and grow it one element at a time with push!, does the underlying array memory block get copied and expanded by one element each time, or in larger chunks (like ArrayList in Java)?
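In larger chunks: Julia grows the buffer geometrically, so push! is amortized O(1), much like ArrayList. A rough way to observe this is to watch how often the array's data pointer changes (this is a sketch and an undercount, since the allocator may grow the block in place or hand back the same address):

```julia
# Count how many times the data block moved while pushing n elements.
function count_reallocs(n)
    a = Int[]
    last = Ptr{Int}(0)
    moves = 0
    for i in 1:n
        push!(a, i)
        p = pointer(a)
        if p != last        # the data block was reallocated (or moved)
            moves += 1
            last = p
        end
    end
    return moves
end

count_reallocs(10_000)   # a few dozen at most, not 10_000: growth is chunked
```

If growth were one element at a time, the pointer could change on nearly every push; the observed count being tiny relative to n is the geometric-growth signature.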
Re: [julia-users] Are dataframes indexed?
As far as Spark/Spark SQL is concerned: there is now a very nice high level API in the process of being open sourced, called Distributed DataFrame (http://ddf.io/). It's java based but also has R and Python interfaces. You could wrap that via JavaCall or via PyCall. On Monday, September 8, 2014 1:32:28 AM UTC+2, Steven Sagaert wrote: On Sunday, September 7, 2014 7:28:18 PM UTC+2, Harlan Harris wrote: This was a feature that sorta existed for a while (see https://github.com/JuliaStats/DataFrames.jl/issues/24 ), but nobody was very happy with it, and I think John ripped it out as part of one of his simplification passes. It's tricky to think about how best to implement this sort of feature when you aspirationally want to support memory-mapped and distributed structures too, I was more thinking along the lines of a simple in-memory db. If you want out-of-memory distributed it's probably best to interface systems like Spark SQL or Scidb rather than develop that yourselves from scratch. Maybe write something in the spirit of Blaze (blaze.pydata.org)? Right now Blaze supports Spark but I was just discussing with them about scidb and they are also looking into that. and where you want a semantics that's explicitly set-like, cf Pandas or R's data.tables. R's data.table is nice but unfortunately only supports one index. Also worth thinking about this in the context of John's just-announced goals: https://gist.github.com/johnmyleswhite/ad5305ecaa9de01e317e On Sun, Sep 7, 2014 at 12:54 PM, John Myles White johnmyl...@gmail.com wrote: No, DataFrames are not indexed. For now, you’d need to build a wrapper that indexes a DataFrame to get that kind of functionality. — John On Sep 7, 2014, at 9:53 AM, Steven Sagaert steven@gmail.com wrote: Hi, I was wondering if searching in a dataframe is indexed (in the DB sense, not the array sense, e.g. a tree index structure) or not? If so, can you have multiple indices (on multiple columns) or not?
Re: [julia-users] Are dataframes indexed?
On Monday, September 8, 2014 1:37:50 AM UTC+2, John Myles White wrote:
> Well, you can write an interface to sqlite to generate in-memory DBs. The only restriction is that you won't get some of the semantics you might want relative to DataFrames, which allow entries of all types. Personally, I'm much less interested in Spark and SciDB and much more interested in Hive.

Well, Spark has a Hive clone called Shark which can run HiveQL queries. It's just a lot faster ;) But they are moving towards a general SQL system called Spark SQL and will reimplement Shark on top of that in the future.

> Blaze's approach is very interesting.

I agree.

> -- John
>
> On Sep 7, 2014, at 4:32 PM, Steven Sagaert steven@gmail.com wrote:
>> On Sunday, September 7, 2014 7:28:18 PM UTC+2, Harlan Harris wrote:
>>> This was a feature that sorta existed for a while (see https://github.com/JuliaStats/DataFrames.jl/issues/24), but nobody was very happy with it, and I think John ripped it out as part of one of his simplification passes. It's tricky to think about how best to implement this sort of feature when you aspirationally want to support memory-mapped and distributed structures too,
>>
>> I was more thinking along the lines of a simple in-memory DB. If you want out-of-memory distributed data, it's probably best to interface with systems like Spark SQL or SciDB rather than develop that yourselves from scratch. Maybe write something in the spirit of Blaze (blaze.pydata.org)? Right now Blaze supports Spark, but I was just discussing SciDB with them and they are also looking into that.
>>
>>> and where you want a semantics that's explicitly set-like, cf Pandas or R's data.tables.
>>
>> R's data.table is nice but unfortunately supports just one index.
>>
>>> Also worth thinking about this in the context of John's just-announced goals: https://gist.github.com/johnmyleswhite/ad5305ecaa9de01e317e
>>>
>>> On Sun, Sep 7, 2014 at 12:54 PM, John Myles White johnmyl...@gmail.com wrote:
>>>> No, DataFrames are not indexed. For now, you'd need to build a wrapper that indexes a DataFrame to get that kind of functionality. -- John
>>>>
>>>> On Sep 7, 2014, at 9:53 AM, Steven Sagaert steven@gmail.com wrote:
>>>>> Hi, I was wondering if searching in a DataFrame is indexed (in the DB sense, not the array sense, e.g. a tree index structure)? If so, can you have multiple indices (on multiple columns)?
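The wrapper John suggests can be sketched with a plain hash index mapping column values to row numbers. The helper below is hypothetical (not part of DataFrames.jl); it works on any column vector, so with a DataFrame you would pass it the column you want indexed:

```julia
# Hypothetical sketch: build a hash index on a column, mapping each
# value to the row numbers where it occurs. Lookup then avoids a scan.
function build_index(col::AbstractVector)
    idx = Dict{eltype(col),Vector{Int}}()
    for (i, v) in enumerate(col)
        push!(get!(idx, v, Int[]), i)
    end
    return idx
end

names = ["ann", "bob", "ann", "cid"]
idx = build_index(names)
idx["ann"]   # [1, 3]: matching row numbers, without scanning the column
```

Multiple indices on multiple columns are just multiple such Dicts; the real work in a production version is keeping them in sync when rows are added or deleted.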
[julia-users] Re: Announcement: Playground.jl
Nice! This will definitely be useful for playing with different versions.

On Saturday, August 23, 2014 10:01:45 PM UTC+2, Rory Finnegan wrote:
> Hi everyone, I've published my Playground.jl (https://github.com/Rory-Finnegan/Playground.jl) package to create Julia sandboxes, like Python virtual environments, if anyone wants to give it a try. So far I've tested it on Funtoo and Linux Mint, but I'm looking for people to try it out on other platforms (like Windows and OSX). Cheers, Rory
[julia-users] Re: problem after upgrading to v0.3.0
Hi Tobias, thanks. I had lines like `#= blabla` (opened but never closed with `=#`), and those were the problem.

On Monday, August 25, 2014 4:21:43 PM UTC+2, Steven Sagaert wrote:
> When running a file non-interactively I get:
>
> julia VMRecommender.jl
> ERROR: syntax: incomplete: unterminated multi-line comment #= ... =#
>  in include at ./boot.jl:245
>  in include_from_node1 at loading.jl:128
>  in process_options at ./client.jl:285
>  in _start at ./client.jl:354
>  in _start_3B_1716 at /usr/bin/../lib/x86_64-linux-gnu/julia/sys.so
>
> any idea?
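For reference, a block comment is only terminated by a matching `=#` (and they nest), so a single stray opener comments out the rest of the file:

```julia
#= This block comment
   spans two lines and is properly terminated. =#
x = 1

#= Block comments #= can nest =#, so every opener needs its own closer. =#
println(x)   # prints 1
```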