date:20151015

Re: OT: why do people use python when it is slow?

2015-10-15 Thread Chris via Digitalmars-d-learn

On Thursday, 15 October 2015 at 09:47:56 UTC, Ola Fosheim Grøstad 
wrote:

On Thursday, 15 October 2015 at 09:24:52 UTC, Chris wrote:
Yep. This occurred to me too. Sorry Ola, but I think you don't 
know how sausages are made.


I most certainly do. I am both doing backend programming and we 
have a farm... :-)


Well, you know how gourmet sausages are made (100% meat), because 
you make them yourself apparently. But I was talking about the 
sausages you get out there ;) A lot of websites are not 
"planned". They are quickly put together to promote an idea. The 
code/architecture is not important at that stage. The idea is 
important. The website has to have dynamic content that can be 
edited by non-programmers (Not even PHP! HTML at most!). If you 
designed a website from a programming point of view first, you'd 
never get the idea out in time.

Re: OT: why do people use python when it is slow?

2015-10-15 Thread Russel Winder via Digitalmars-d-learn

On Thu, 2015-10-15 at 10:00 +, Chris via Digitalmars-d-learn wrote:
> 
[…]
> Well, you know how gourmet sausages are made (100% meat), because 
> you make them yourself apparently. But I was talking about the 
> sausages you get out there ;) A lot of websites are not 
> "planned". They are quickly put together to promote an idea. The 
> code/architecture is not important at that stage. The idea is 
> important. The website has to have dynamic content that can be 
> edited by non-programmers (Not even PHP! HTML at most!). If you 
> designed a website from a programming point of view first, you'd 
> never get the idea out in time.

And most commercial websites selling things are truly appalling: slow
performance, atrocious usability/UX. Who cares if the site is
brilliantly tuned if it is unusable?


-- 
Russel.
=
Dr Russel Winder  t: +44 20 7585 2200   voip: sip:russel.win...@ekiga.net
41 Buckmaster Roadm: +44 7770 465 077   xmpp: rus...@winder.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder




Building and Running Unittests for a Specific Phobos Package Only

2015-10-15 Thread Nordlöw via Digitalmars-d-learn

Is there a Make-target for building and running the unittests for 
a specific Phobos package, say `std.range`, only?

Re: OT: why do people use python when it is slow?

2015-10-15 Thread Russel Winder via Digitalmars-d-learn

On Thu, 2015-10-15 at 09:35 +, Ola Fosheim Grøstad via Digitalmars-
d-learn wrote:
> On Thursday, 15 October 2015 at 07:57:51 UTC, Russel Winder wrote:
> > lot better than it could be. From small experiments D is (and 
> > also Chapel is even more) hugely faster than Python/NumPy at 
> > things Python people think NumPy is brilliant for. Expectations
> 
> Have you had a chance to look at PyOpenCL and PYCUDA?

Yes. 

CUDA is of course doomed in the long run as Intel put GPGPU on the
processor chip. OpenCL will eventually be replaced with Vulkan
(assuming they can get the chips made).

-- 
Russel.

Re: OT: why do people use python when it is slow?

2015-10-15 Thread Ola Fosheim Grøstad via Digitalmars-d-learn


On Thursday, 15 October 2015 at 09:24:52 UTC, Chris wrote:
Yep. This occurred to me too. Sorry Ola, but I think you don't 
know how sausages are made.


I most certainly do. I am both doing backend programming and we 
have a farm... :-)


Do you really think that all the websites out there are 
performance tuned by network programming specialists? You'd be 
surprised!


If they are to scale, then they have to pick algorithms and 
architectures that scale. This is commodity nowadays. You want to 
get as close to O(1) as possible for requests. This is how you 
build scalable systems. No point in having 1ms response time 
under low load and 1ms response time when the incoming link 
is saturated.


You'd rather have 100ms response under low load and 120ms 
response time when saturated + 99.% availability/uptime.


Robustness and scaling cost latency, but you want acceptable and 
stable QoS, not brilliant QoS under low load and horrible QoS 
under high load.


Scalable websites aren't designed like sports cars, they are 
designed like trains.

Re: OT: why do people use python when it is slow?

2015-10-15 Thread Russel Winder via Digitalmars-d-learn

On Thu, 2015-10-15 at 06:48 +, data pulverizer via Digitalmars-d-
learn wrote:
> 
[…]
> A journey of a thousand miles ...

Exactly.

> I tried to start creating a data table type object by 
> investigating variantArray: 
> http://forum.dlang.org/thread/hhzavwrkbrkjzfohc...@forum.dlang.org
>  but hit the snag that D is a static programming language and may not
> allow the kind of behaviour you need for creating the same kind of
> behaviour you need in data table - like objects.
> 
> I envisage such an object as being composed of arrays of vectors 
> where each vector represents a column in a table as in R - easier 
> for model matrix creation. Some people believe that you should 
> work with arrays of tuple rows - which may be more big data 
> friendly. I am not overly wedded to either approach.
> 
> Anyway it seems I have hit an inherent limitation in the 
> language. Correct me if I am wrong. The data frame needs to have 
> dynamic behaviour bind rows and columns and return parts of 
> itself as a data table etc and since D is a static language we 
> cannot do this.

Just because D doesn't have this now doesn't mean it cannot. C doesn't
have such capability but R and Python do even though R and CPython are
just C codes.

Pandas data structures rely on the NumPy n-dimensional array
implementation, it is not beyond the bounds of possibility that that
data structure could be realized as a D module.

Is R's data.table written in R or in C? In either case, it is not
beyond the bounds of possibility that that data structure could be
realized as a D module.

The core issue is to have a seriously efficient n-dimensional array
that is amenable to data parallelism and is extensible. As far as I am
aware currently (I will investigate more) the NumPy array is a good
native code array, but has some issues with data parallelism and Pandas
has to do quite a lot of work to get the extensibility. I wonder how
the R data.table works.

I have this nagging feeling that like NumPy, data.table seems a lot
better than it could be. From small experiments D is (and also Chapel
is even more) hugely faster than Python/NumPy at things Python people
think NumPy is brilliant for. Expectations of Python programmers are
set by the scale of Python performance, so NumPy seems brilliant.
Compared to the scale set by D and Chapel, NumPy is very disappointing.
I bet the same is true of R (I have never really used R).

This is therefore an opportunity for D to step in. However it is a
journey of a thousand miles to get something production worthy.
Python/NumPy/Pandas have had a very large number of programmer hours
expended on them.  Doing this poorly as a D module is likely worse
than not doing it at all.

-- 
Russel.

Re: OT: why do people use python when it is slow?

2015-10-15 Thread Chris via Digitalmars-d-learn

On Wednesday, 14 October 2015 at 18:17:29 UTC, Russel Winder 
wrote:




The thing about Python is NumPy, SciPy, Pandas, Matplotlib, 
IPython, Jupyter, GNU Radio. The data science, bioinformatics, 
quant, signal processing, etc. people do not give a sh!t which 
language they used, what they want is to get their results as 
fast as possible. Most of them do not write programs that are 
to last, they are effectively throw away programs. This leads 
them to Python (or R) and they are not really interested in 
learning anything else.




Scary, but I agree with you again. In science this is exactly 
what usually happens. Throw away programs, a list here, a loop 
there, clumsy, inefficient code. And that's fine, in a way that's 
what scripting is for. The problems start to kick in when the 
same guys get the idea to go public and write a program that 
everyone can use. Then you have a mess of slow code 
(undocumented) in a slow language. This is why I always say "Use 
C, C++ or D from the very beginning" or at least document your 
code in a way that it can easily be rewritten in D or C. But 
well, you know, results, papers, conferences ... This is why many 
innovations live in an eternal Matlab or Python limbo.

Re: OT: why do people use python when it is slow?

2015-10-15 Thread data pulverizer via Digitalmars-d-learn


On Thursday, 15 October 2015 at 02:20:42 UTC, jmh530 wrote:
On Wednesday, 14 October 2015 at 22:11:56 UTC, data pulverizer 
wrote:
On Tuesday, 13 October 2015 at 23:26:14 UTC, Laeeth Isharc 
wrote:

https://www.quora.com/Why-is-Python-so-popular-despite-being-so-slow
Andrei suggested posting more widely.



I believe it is easier and more effective to start on the 
research side. D will need: [snip]


Great list, but tons of work!


A journey of a thousand miles ...

I tried to start creating a data table type object by 
investigating variantArray: 
http://forum.dlang.org/thread/hhzavwrkbrkjzfohc...@forum.dlang.org 
but hit the snag that D is a static programming language and may 
not allow the kind of behaviour you need in data-table-like 
objects.


I envisage such an object as being composed of arrays of vectors 
where each vector represents a column in a table as in R - easier 
for model matrix creation. Some people believe that you should 
work with arrays of tuple rows - which may be more big data 
friendly. I am not overly wedded to either approach.


Anyway it seems I have hit an inherent limitation in the 
language. Correct me if I am wrong. The data frame needs to have 
dynamic behaviour (bind rows and columns, return parts of itself 
as a data table, etc.), and since D is a static language we 
cannot do this.
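
For what it is worth, the column-oriented layout described above can be sketched in a static language. This is only a minimal sketch, assuming std.variant's Algebraic is acceptable for the column type; the names Column, DataTable, bindColumn and select are hypothetical and come from no existing library. Binding rows and supporting more element types are left out.

```d
import std.stdio : writeln;
import std.variant : Algebraic;

// Hypothetical column type: each column is one homogeneous native array.
alias Column = Algebraic!(double[], string[]);

// Hypothetical column-oriented table, loosely modelled on an R data frame.
struct DataTable
{
    Column[string] columns;

    // Bind (add or replace) a column at run time.
    void bindColumn(T)(string name, T[] values)
    {
        columns[name] = Column(values);
    }

    // Return part of the table as a new table.
    DataTable select(string[] names...)
    {
        DataTable result;
        foreach (name; names)
            result.columns[name] = columns[name];
        return result;
    }
}

void main()
{
    DataTable t;
    t.bindColumn("height", [1.62, 1.80]);
    t.bindColumn("name", ["ann", "bob"]);

    // Take a one-column sub-table and read the column back as a native array.
    auto part = t.select("height");
    writeln(part.columns["height"].get!(double[]));
}
```

The set of column names is decided at run time, while each column stays a plain native array, which is roughly the arrangement described for R above.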

Re: OT: why do people use python when it is slow?

2015-10-15 Thread Chris via Digitalmars-d-learn


On Wednesday, 14 October 2015 at 18:37:40 UTC, Mengu wrote:
On Wednesday, 14 October 2015 at 05:42:12 UTC, Ola Fosheim 
Grøstad wrote:
On Tuesday, 13 October 2015 at 23:26:14 UTC, Laeeth Isharc 
wrote:

https://www.quora.com/Why-is-Python-so-popular-despite-being-so-slow
Andrei suggested posting more widely.


That's flamebait:

«Many really popular websites use Python. But why is that? 
Doesn't it affect the performance of the website?»


No. Really popular websites use pre-generated content / front 
end caches / CDNs or wait for network traffic from distributed 
databases.


really popular portals, news sites? yes. really popular 
websites? nope. like booking.com, airbnb.com, reddit.com are 
popular websites that have many parts which have to be dynamic 
and responsive as hell and they cannot use caching, 
pre-generated content, etc.


using python does affect the performance of your website. if you 
were to use ruby or php your web app would be slower than its 
python version. and the python version would be slower than a go 
or d version.


Yep. This occurred to me too. Sorry Ola, but I think you don't 
know how sausages are made. Do you really think that all the 
websites out there are performance tuned by network programming 
specialists? You'd be surprised!

Re: OT: why do people use python when it is slow?

2015-10-15 Thread Ola Fosheim Grøstad via Digitalmars-d-learn


On Thursday, 15 October 2015 at 07:57:51 UTC, Russel Winder wrote:
lot better than it could be. From small experiments D is (and 
also Chapel is even more) hugely faster than Python/NumPy at 
things Python people think NumPy is brilliant for. Expectations


Have you had a chance to look at PyOpenCL and PYCUDA?

Re: Why isn't global operator overloading allowed in D?

2015-10-15 Thread Shriramana Sharma via Digitalmars-d-learn

John Colvin wrote:

> On Wednesday, 14 October 2015 at 15:02:02 UTC, Shriramana Sharma
> wrote:
> What binary arithmetic operators do you need that real[] doesn't
> already support?

OMG silly me! I can already do a[] /= b[]... D is great! :-D Thanks a lot!

Re: OT: why do people use python when it is slow?

2015-10-15 Thread Ola Fosheim Grøstad via Digitalmars-d-learn


On Thursday, 15 October 2015 at 10:00:21 UTC, Chris wrote:
about the sausages you get out there ;) A lot of websites are 
not "planned". They are quickly put together to promote an idea.


They are WordPress sites... :-(

If you designed a website from a programming point of view 
first, you'd never get the idea out in time.


It's not that bad, but modelling data for nosql databases is a 
bigger challenge than getting decent performance from the code.


There is another issue with using languages like Rust/C++/D and 
that is: if it crashes you lose all the concurrent requests, 
perhaps even without a reasonable log trace. What I'd want for 
handling requests is something less fragile where only the single 
request that went bad crashes out. Pure Python and Java provide 
this property.

Re: Class, constructor and inherance.

2015-10-15 Thread holo via Digitalmars-d-learn

Thank you for the example. I also asked the programmers at work 
about it (PHP guys), and they explained to me how you see those 
interfaces being used in my code. They prepared a "skeleton" for 
me on which I will try to build my solution. I will be back when 
I have some code.

Re: Why isn't global operator overloading allowed in D?

2015-10-15 Thread John Colvin via Digitalmars-d-learn

On Thursday, 15 October 2015 at 15:45:00 UTC, Shriramana Sharma 
wrote:

John Colvin wrote:

On Wednesday, 14 October 2015 at 15:02:02 UTC, Shriramana 
Sharma wrote:
What binary arithmetic operators do you need that real[] 
doesn't already support?


OMG silly me! I can already do a[] /= b[]... D is great! :-D 
Thanks a lot!


Also:

a[] = b[] + c[] * d[] - 42;

and so on... All that's required is that there is pre-allocated 
memory for the result to go in i.e. there has to be enough space 
in a[].


You should be aware that with DMD these array operations should 
be much faster than a straightforward loop, as they are done in 
handwritten asm using vector instructions. Be wary of using them 
on very small arrays, there is some overhead.


With LDC/GDC you probably won't see much difference either way, 
if I remember correctly they rely on the optimiser instead.
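
A minimal sketch of the array operations mentioned above, with the destination allocated up front. The function name combine and the sample values are made up for illustration, and no timing claim is implied.

```d
// Element-wise b + c * d - 42, written into freshly allocated storage.
double[] combine(const double[] b, const double[] c, const double[] d)
{
    auto a = new double[](b.length); // the destination must already have room
    a[] = b[] + c[] * d[] - 42;
    return a;
}

void main()
{
    double[] b = [50, 60, 70];
    double[] c = [1, 2, 3];
    double[] d = [2, 2, 2];

    auto a = combine(b, c, d);
    assert(a == [10.0, 22.0, 34.0]);

    a[] /= d[]; // the in-place form mentioned earlier in the thread
    assert(a == [5.0, 11.0, 17.0]);
}
```

All operands need the same length as the destination slice.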

Re: Building and Running Unittests for a Specific Phobos Package Only

2015-10-15 Thread Marc Schütz via Digitalmars-d-learn


On Thursday, 15 October 2015 at 10:07:29 UTC, Nordlöw wrote:
Is there a Make-target for building and running the unittests 
for a specific Phobos package, say `std.range`, only?


make -f posix.mak std/range.test

Re: OT: why do people use python when it is slow?

2015-10-15 Thread data pulverizer via Digitalmars-d-learn


On Thursday, 15 October 2015 at 07:57:51 UTC, Russel Winder wrote:
On Thu, 2015-10-15 at 06:48 +, data pulverizer via 
Digitalmars-d- learn wrote:
Just because D doesn't have this now doesn't mean it cannot. C 
doesn't have such capability but R and Python do even though R 
and CPython are just C codes.


I think the way R does this is that its dynamic runtime 
environment is used to bind together native C basic type arrays. 
I wonder if we could simulate dynamic behaviour by leveraging D's 
short compilation time to dynamically write/update data table 
source file(s) containing the structure of new/modified data 
tables?


Pandas data structures rely on the NumPy n-dimensional array 
implementation, it is not beyond the bounds of possibility that 
that data structure could be realized as a D module.


Julia's DArray object is an interesting take on this: 
https://github.com/JuliaParallel/DistributedArrays.jl


I believe that parallelism on arrays and data tables are 
different challenges. Data tables are easier since we can 
parallelise by row, thus the preference of having row-based 
tuples.
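
As a rough illustration of parallelising by row, here is a minimal sketch using std.parallelism. The Row type and the computeTotals function are hypothetical, and the per-row work is deliberately trivial.

```d
import std.parallelism : parallel;

// Hypothetical row type; a real table would have more fields.
struct Row
{
    double x;
    double y;
    double total;
}

// Fill in `total` for every row, with the rows spread over worker threads.
void computeTotals(Row[] rows)
{
    foreach (ref row; parallel(rows))
        row.total = row.x + row.y;
}

void main()
{
    auto rows = new Row[](1_000);
    foreach (i, ref row; rows)
    {
        row.x = i;
        row.y = 2.0 * i;
    }

    computeTotals(rows);
    assert(rows[10].total == 30.0);
}
```

Each row is touched by exactly one iteration, so the loop body needs no locking.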


The core issue is to have a seriously efficient n-dimensional 
array that is amenable to data parallelism and is extensible. 
As far as I am aware currently (I will investigate more) the 
NumPy array is a good native code array, but has some issues 
with data parallelism and Pandas has to do quite a lot of work 
to get the extensibility. I wonder how the R data.table works.


R's data table is not currently parallelised

I have this nagging feeling that like NumPy, data.table seems a 
lot better than it could be. From small experiments D is (and 
also Chapel is even more) hugely faster than Python/NumPy at 
things Python people think NumPy is brilliant for. Expectations 
of Python programmers are set by the scale of Python 
performance, so NumPy seems brilliant. Compared to the scale 
set by D and Chapel, NumPy is very disappointing. I bet the 
same is true of R (I have never really used R).


Thanks for notifying me about Chapel - something else interesting 
to investigate. When it comes to speed R is very strange. Basic 
math (e.g. *, +, /) operation on an R array can be fast but 
for-looping will kill speed by hundreds of times - most things 
are slow in R unless they are directly baked into its base 
operations. You can write code in C or C++ and call it very 
easily from R, though, using the Rcpp interface.



This is therefore an opportunity for D to step in. However it 
is a journey of a thousand miles to get something production 
worthy. Python/NumPy/Pandas have had a very large number of 
programmer hours expended on them.  Doing this poorly as a D 
module is likely worse than not doing it at all.


I think D has a lot to offer the world of data science.

Re: Building and Running Unittests for a Specific Phobos Package Only

2015-10-15 Thread Nordlöw via Digitalmars-d-learn


On Thursday, 15 October 2015 at 13:12:32 UTC, Marc Schütz wrote:

make -f posix.mak std/range.test


Thx

Re: Class, constructor and inherance.

2015-10-15 Thread Rikki Cattermole via Digitalmars-d-learn


On 16/10/15 4:14 PM, holo wrote:

I created an interface, IfRequestHandler. It is used by only one
class, RequestHandlerXML, right now, but thanks to this solution I can
create more classes with the same interface that handle it in a
different way, e.g. a second could be RequestHandlerCSVReport or
RequestHandlerSendViaEmail. Is this what you were mentioning? Working
code below:

/
//main.d:
/

#!/usr/bin/rdmd

import std.stdio, sigv4, conf;



void main()
{
 ResultHandlerXML hand = new ResultHandlerXML;
 SigV4 req = new SigV4(hand);

 //req.ResultHandler = hand;
 req.go();
 hand.processResult();

}

/
//conf.d:
/

module conf;

import std.stdio, std.process;
import std.net.curl:exit;

interface IfConfig
{
 void set(string val, string var);
 string get(string var);
}


class Config : IfConfig
{
 this()
 {
 this.accKey = environment.get("AWS_ACCESS_KEY");
 if(accKey is null)
 {
 writeln("accessKey not available");
 exit(-1);
 }
 this.secKey = environment.get("AWS_SECRET_KEY");
 if(secKey is null)
 {
 writeln("secretKey not available");
 exit(-1);
 }
 }

 public:

 void set(string val, string var)
 {
 switch(var)
 {
 case "accKey": accKey = val; break;
 case "secKey": secKey = val; break;
 default: writeln("Can not be set, not such value");
 }
 }


 string get(string var)
 {
 string str = "";

 switch(var)
 {
 case "accKey": return accKey;
 case "secKey": return secKey;
 default: writeln("Can not be get, not such value");
 }

 return str;
 }


  //   private:

 string accKey;
 string secKey;

}

/
//sigv4.d
/

module sigv4;

import std.stdio, std.process;
import std.digest.sha, std.digest.hmac;
import std.string;
import std.conv;
import std.datetime;
import std.net.curl;
import conf;


interface IfSigV4
{
 IfResultHandler go(ResultHandlerXML ResultHandler);
}

interface IfResultHandler
{
 void setResult(int content);
 void processResult();
}

class ResultHandlerXML : IfResultHandler
{
 void setResult(int content)
 {
 this.xmlresult = content;
 }

 void processResult()
 {
 writeln(xmlresult);
 }

 private:
 int xmlresult;
}

class SigV4 : IfSigV4
{
 // could be changed to take some structure as a parameter instead of
 // such a large number of attributes

 this(string methodStr = "GET", string serviceStr = "ec2", string
hostStr = "ec2.amazonaws.com", string regionStr = "us-east-1", string
endpointStr = "https://ec2.amazonaws.com", string payloadStr = "",
string parmStr = "Action=DescribeInstances")
 in
 {
 writeln(parmStr);
 }
 body
 {
 conf.Config config = new conf.Config;

 this.method = methodStr;
 this.service = serviceStr;
 this.host = hostStr;
 this.region = regionStr;
 this.endpoint = endpointStr;
 this.payload = payloadStr;
 this.requestParameters = parmStr;

 this.accessKey = config.get("accKey");
 if(accessKey is null)
 {
 writeln("accessKey not available");
 exit(-1);
 }
 this.secretKey = config.get("secKey");
 if(secretKey is null)
 {
 writeln("secretKey not available");
 exit(-1);
 }

 }

 public:
 string method;
 string service;
 string host;
 string region;
 string endpoint;
 string payload;
 string requestParameters;

 IfResultHandler ResultHandler;




 IfResultHandler go(ResultHandlerXML ResultHandler)
 {
 //time need to be set when we are sending request not before
 auto currentClock = Clock.currTime(UTC());
 auto currentDate = cast(Date)currentClock;
 auto curDateStr = currentDate.toISOString;
 auto currentTime = cast(TimeOfDay)currentClock;
 auto curTimeStr = currentTime.toISOString;
 auto xamztime = curDateStr ~ "T" ~ curTimeStr ~ "Z";

 canonicalURI = "/";
 canonicalQueryString = requestParameters ~ this.Version;
 canonicalHeadersString =  "host:" ~ this.host ~ "\n" ~
"x-amz-date:" ~ xamztime ~ "\n";
 signedHeaders = "host;x-amz-date";

 auto canonicalRequest = getCanonicalRequest(canonicalURI,
canonicalQueryString, canonicalHeadersString, signedHeaders);

 string credentialScope = curDateStr ~ "/" ~ region ~ "/" ~
service ~ "/" ~ "aws4_request";

 string stringToSign = algorithm ~ "\n" ~ xamztime ~ "\n" ~

Re: OT: why do people use python when it is slow?

2015-10-15 Thread Laeeth Isharc via Digitalmars-d-learn


On Wednesday, 14 October 2015 at 15:25:22 UTC, David DeWitt wrote:
On Wednesday, 14 October 2015 at 14:48:22 UTC, John Colvin 
wrote:

On Wednesday, 14 October 2015 at 14:32:00 UTC, jmh530 wrote:
On Tuesday, 13 October 2015 at 23:26:14 UTC, Laeeth Isharc 
wrote:

https://www.quora.com/Why-is-Python-so-popular-despite-being-so-slow
Andrei suggested posting more widely.


I was just writing some R code yesterday after playing around 
with D for a couple weeks. I accomplished more in an 
afternoon of R coding than I think I had in like a month's 
worth of playing around with D. The same is true for python.


As someone who uses both D and Python every day, I find that - 
once you are proficient in both - initial productivity is 
higher in Python and then D starts to overtake as a project 
gets larger and/or has stricter requirements. I hope never to 
have to write anything longer than a thousand lines in Python 
ever again.


That's true until you need to connect to other systems.  There 
are countless clients built for other systems that are used in 
real world applications.  With web development the Python code 
really just becomes glue nowadays and APIs.  I understand D is 
faster until you have to build the clients for systems to 
connect.  We have an application that uses Postgres, 
ElasticSearch, Kafka, Redis, etc. This is plenty fast and the 
productivity of Python is more than D as the clients for 
Elasticsearch, Postgres and various other systems are 
unavailable or incomplete.  Sure D is faster but when you have 
other real world systems to connect to and time constraints on 
projects how can D be more productive or faster?  Our python 
code essentially becomes the API and usage of clients to other 
systems which handle a majority of the hardcore processing.  
Once D gets established with those clients and they are battle 
tested then I will agree.  To me productivity is more than the 
language itself but also building real world applications in a 
reasonable time-frame.  D will get there but is nowhere near 
where Python is.


Few thoughts:

1. It's easy to embed Python in your D applications.  I do this 
for things like web scraping and when I want to write something 
quick to read simple XML (I just convert to JSON).


2. Of course there is a Redis client.  Elasticsearch is an 
amazing product, but hardly requires much work to have a complete 
API.  I made a start on this, and if I use Elasticsearch more 
then I'll have one done and will release it.  I don't know the 
finer aspects of Postgres to know what is involved.


3. That raises a broader point, which is that it depends on the 
ultimate aim of your project and what it is about the right 
tradeoff between different things.  It will ultimately be much 
more productive for me to do things in D for the reasons John 
alludes to.  A little work to get started is neither here nor 
there in the major scheme of things.  Adam Ruppe made the same 
point - it's not all that much work to put a foundation that 
suits you in place.  You do it once (and maybe add things when 
something like Elasticsearch comes out), and that's it, apart 
from minor updates.  The dollar expenditure on building these 
things is not enormous given the stakes involved for me.  But 
that doesn't mean that you should get to the same answer, as it 
depends.


4. I am not sure that all web development is just glue, or will 
be going forward given what might be on the horizon, but time 
will tell.



Laeeth.

Strange behavior of array

2015-10-15 Thread VlasovRoman via Digitalmars-d-learn

I get this with dmd 2.068.2 and dmd 2.069-b2. I think this 
behavior is somewhat strange:


I have some code:

enum int m = 10;
enum int n = 5;

ubyte[m][n] array;
for(int x = 0; x < m; x++) {
for(int y = 0; y < n; y++) {
array[x][y] = cast(ubyte)(x + y);
}   
}

At runtime I get a range violation error. Swapping the indices 
when accessing the array helps. What am I not understanding?


Thanks.

Re: Strange behavior of array

2015-10-15 Thread Rikki Cattermole via Digitalmars-d-learn


On 16/10/15 3:39 PM, VlasovRoman wrote:

enum int m = 10;
enum int n = 5;

ubyte[m][n] array;
for(int x = 0; x < m; x++) {
 for(int y = 0; y < n; y++) {
 array[x][y] = cast(ubyte)(x + y);
 }
}


First on the left(declaration), last on the right(index/assign).

void main()
{
enum int m = 10;
enum int n = 5;

ubyte[m][n] array;
for(int x = 0; x < m; x++) {
for(int y = 0; y < n; y++) {
array[y][x] = cast(ubyte)(x + y);
}
}   
}
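
To spell out the rule above: the declaration ubyte[m][n] reads right to left, so it is an array of n elements, each of type ubyte[m]. The first index therefore has to stay below n and the second below m. A small sketch that checks this (the function name makeTable is made up for illustration):

```d
enum int m = 10;
enum int n = 5;

// ubyte[m][n] reads right to left: n elements, each of type ubyte[m].
ubyte[m][n] makeTable()
{
    ubyte[m][n] array;
    static assert(is(typeof(array[0]) == ubyte[m]));

    foreach (y; 0 .. n)         // outer index runs over the n rows
        foreach (x; 0 .. m)     // inner index runs over the m columns
            array[y][x] = cast(ubyte)(x + y);
    return array;
}

void main()
{
    auto array = makeTable();
    assert(array.length == n);        // 5 rows
    assert(array[0].length == m);     // 10 columns per row
    assert(array[n - 1][m - 1] == 13);
}
```

With the indices the other way round, array[x][y] goes out of range as soon as x reaches n, which is the range violation reported in the question.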

Alias lamda argument vs Type template argument

2015-10-15 Thread Freddy via Digitalmars-d-learn

There are two different ways to pass functions as template 
arguments. Which is preferred?

---
void funcA(alias calle)()
{
calle();
}

void funcB(T)(T calle)
{
calle();
}

void main()
{
funcA!(() => 0);
funcB(() => 0);
}
---

Re: Strange behavior of array

2015-10-15 Thread VlasovRoman via Digitalmars-d-learn

On Friday, 16 October 2015 at 02:46:03 UTC, Rikki Cattermole 
wrote:

On 16/10/15 3:39 PM, VlasovRoman wrote:

enum int m = 10;
enum int n = 5;

ubyte[m][n] array;
for(int x = 0; x < m; x++) {
 for(int y = 0; y < n; y++) {
 array[x][y] = cast(ubyte)(x + y);
 }
}


First on the left(declaration), last on the right(index/assign).

void main()
{
enum int m = 10;
enum int n = 5;

ubyte[m][n] array;
for(int x = 0; x < m; x++) {
for(int y = 0; y < n; y++) {
array[y][x] = cast(ubyte)(x + y);
}
}   
}


Oh, thank you. Some strange solution.

Re: Class, constructor and inherance.

2015-10-15 Thread holo via Digitalmars-d-learn

I created an interface, IfRequestHandler. It is used by only one 
class, RequestHandlerXML, right now, but thanks to this solution 
I can create more classes with the same interface that handle it 
in a different way, e.g. a second could be RequestHandlerCSVReport 
or RequestHandlerSendViaEmail. Is this what you were mentioning? 
Working code below:


/
//main.d:
/

#!/usr/bin/rdmd

import std.stdio, sigv4, conf;



void main()
{
ResultHandlerXML hand = new ResultHandlerXML;
SigV4 req = new SigV4(hand);

//req.ResultHandler = hand;
req.go();
hand.processResult();

}

/
//conf.d:
/

module conf;

import std.stdio, std.process;
import std.net.curl:exit;

interface IfConfig
{
void set(string val, string var);
string get(string var);
}


class Config : IfConfig
{
this()
{
this.accKey = environment.get("AWS_ACCESS_KEY");
if(accKey is null)
{
writeln("accessKey not available");
exit(-1);
}
this.secKey = environment.get("AWS_SECRET_KEY");
if(secKey is null)
{
writeln("secretKey not available");
exit(-1);
}
}

public:

void set(string val, string var)
{
switch(var)
{
case "accKey": accKey = val; break;
case "secKey": secKey = val; break;
default: writeln("Can not be set, not such 
value");

}
}


string get(string var)
{
string str = "";

switch(var)
{
case "accKey": return accKey;
case "secKey": return secKey;
default: writeln("Can not be get, not such value");
}

return str;
}


 //   private:

string accKey;
string secKey;

}

/
//sigv4.d
/

module sigv4;

import std.stdio, std.process;
import std.digest.sha, std.digest.hmac;
import std.string;
import std.conv;
import std.datetime;
import std.net.curl;
import conf;


interface IfSigV4
{
IfResultHandler go(ResultHandlerXML ResultHandler);
}

interface IfResultHandler
{
void setResult(int content);
void processResult();
}

class ResultHandlerXML : IfResultHandler
{
void setResult(int content)
{
this.xmlresult = content;
}

void processResult()
{
writeln(xmlresult);
}

private:
int xmlresult;
}

class SigV4 : IfSigV4
{
	// could be changed to take some structure as a parameter instead 
	// of such a large number of attributes


	this(string methodStr = "GET", string serviceStr = "ec2", string 
hostStr = "ec2.amazonaws.com", string regionStr = "us-east-1", 
string endpointStr = "https://ec2.amazonaws.com", string 
payloadStr = "", string parmStr = "Action=DescribeInstances")

in
{
writeln(parmStr);
}
body
{
conf.Config config = new conf.Config;

this.method = methodStr;
this.service = serviceStr;
this.host = hostStr;
this.region = regionStr;
this.endpoint = endpointStr;
this.payload = payloadStr;
this.requestParameters = parmStr;

this.accessKey = config.get("accKey");
if(accessKey is null)
{
writeln("accessKey not available");
exit(-1);
}
this.secretKey = config.get("secKey");
if(secretKey is null)
{
writeln("secretKey not available");
exit(-1);
}

}

public:
string method;
string service;
string host;
string region;
string endpoint;
string payload;
string requestParameters;

IfResultHandler ResultHandler;




IfResultHandler go(ResultHandlerXML ResultHandler)
{
//time need to be set when we are sending request not 
before
auto currentClock = Clock.currTime(UTC());
auto currentDate = cast(Date)currentClock;
auto curDateStr = currentDate.toISOString;
auto currentTime = cast(TimeOfDay)currentClock;
auto curTimeStr = currentTime.toISOString;
auto xamztime = curDateStr ~ "T" ~ curTimeStr ~ "Z";

canonicalURI = "/";
canonicalQueryString = requestParameters ~ this.Version;
			canonicalHeadersString =  "host:" ~ this.host ~ "\n" ~ 
"x-amz-date:" ~ xamztime ~ "\n";

signedHeaders = "host;x-amz-date";

			auto canonicalRequest = getCanonicalRequest(canonicalURI,

Re: Builtin array and AA efficiency questions

2015-10-15 Thread Random D user via Digitalmars-d-learn

Ah missed your post before replying to H.S. Teoh (I should 
refresh more often).

Thanks for reply.

On Thursday, 15 October 2015 at 19:50:27 UTC, Steven 
Schveighoffer wrote:


> Without more context, I would say no. assumeSafeAppend is an
> assumption, and therefore unsafe. If you don't know what is
> passed in, you could potentially clobber data.
>
> In addition, assumeSafeAppend is a non-inlineable, runtime
> function that can *potentially* be low-performing.


Yeah I know that I want to overwrite the data, but still that's 
probably a lot of calls to assumeSafeAppend. So I agree.


> If, for instance, you call it on a non-GC array, or one that is
> not marked for appending, you will most certainly need to take
> the GC lock and search through the heap for your block.




What does "marked for appending" mean? How does it happen, or how 
is it marked?


> The best place to call assumeSafeAppend is when you are sure
> the array has "shrunk" and you are about to append. If you have
> not shrunk the array, then the call is a waste, if you are not
> sure what the array contains, then you are potentially stomping
> on referenced data.


So assumeSafeAppend is only useful when I have an array whose 
length has been set lower than it was originally and I want to 
grow it back (that is, arr.length += 1 or arr ~= 1).




> An array uses a block marked for appending, assumeSafeAppend
> simply sets how much data is assumed to be valid. Calling
> assumeSafeAppend on a block not marked for appending will do
> nothing except burn CPU cycles.
>
> So yours is not an accurate description.


Related to my question above.
How do you get a block not marked for appending? a view slice?

Perhaps I should re-read the slice article. I believe it had 
something like capacity == 0 --> always allocates. Is it this?




>> A.3) If A.2 is true, are there any conditions that it reverts to
>> original behavior? (e.g. if I take a new slice of that array)
>
> Any time data is appended, all references *besides* the one
> that was used to append now will reallocate on appending. Any
> time data is shrunk (i.e. arr = arr[0..$-1]), that reference
> now will reallocate on appending.




Thanks. IMO this is a very concise description of the allocation 
behavior.

I'll use this as a guide.

> So when to call really sort of requires understanding what the
> runtime does. Note it is always safe to just never use
> assumeSafeAppend, it is an optimization. You can always append
> to anything (even non-GC array slices) and it will work
> properly.


Out of curiosity, how does this work? Does it always just 
reallocate with the GC if the memory was allocated with something else?




> This is an easy call then:
>
> array.reserve(100); // reserve 100 elements for appending
> array ~= data; // automatically manages array length for you;
>                // if length exceeds 100, just automatically
>                // reallocates more data.
>
> array.length = 0; // clear all the data
> array.assumeSafeAppend; // NOW is the best time to call, because
>                         // you can't shrink it any more, and you
>                         // know you will be appending again.
> array ~= data; // no reallocation, unless previous max size was
>                // exceeded.




Thanks. This will probably cover 90% of cases.
Usually I just want to avoid throwing away memory that I already 
have.

Which is slow if it's all over your codebase.
Like re-reading or recomputing variables that you already have.
One doesn't hurt but a hundred does.

>> B.1) I have a temporary AA whose lifetime is limited to a known
>> span (might be a function or a loop with a couple of functions).
>> Is there a way to tell the runtime to immediately destroy and
>> free the AA?
>
> There isn't. This reminds me, I have a lingering PR to add
> aa.clear which destroys all the elements, but was waiting until
> object.clear had been removed for the right amount of time.
> Perhaps it's time to revive that.


Should array have clear() as well?
Basically wrap array.length = 0; array.assumeSafeAppend();
At least it would then be symmetric (and more intuitive) with 
built-in containers.




> -Steve

Re: OT: why do people use python when it is slow?

2015-10-15 Thread Laeeth Isharc via Digitalmars-d-learn

On Wednesday, 14 October 2015 at 22:11:56 UTC, data pulverizer 
wrote:
On Tuesday, 13 October 2015 at 23:26:14 UTC, Laeeth Isharc 
wrote:

https://www.quora.com/Why-is-Python-so-popular-despite-being-so-slow
Andrei suggested posting more widely.


I am coming at D by way of R, C++, Python etc. so I speak as a 
statistician who is interested in data science applications.


Welcome...  Looks like we have similar interests.

To sit on the deployment side, D needs to grow its big 
data/NoSQL infrastructure for a start, then hook into a whole 
ecosystem of analytic tools in an easy and straightforward 
manner. This will take a lot of work!


Indeed.  The dlangscience project managed by John Colvin is very 
interesting.  It is not a pure stats project, but there will be 
many shared areas of need.  He has some very interesting ideas, 
and being able to mix Python and D in a Jupyter notebook is rather 
nice (you can do this already).


I believe it is easier and more effective to start on the 
research side. D will need:


1. A data table structure like R's data.frame or data.table. 
This is a dynamic data structure that represents a table that 
can have lots of operations applied to it. It is the data 
structure that separates R from most programming languages. It 
is what pandas tries to emulate. This includes text file and 
database i/o from mySQL and ODBC for a start.


I fully agree, and have made a very simple start on this.  See 
github. It's usable for my needs as they stand, although far from 
production ready or elegant.  You can read and write to/from CSV 
and HDF5.  I guess MySQL and ODBC wouldn't be hard to add, but I 
don't need them myself for now and won't have time to do it 
myself.  If I have space I may channel some resources in that 
direction some time next year.


2. Formula class : the ability to talk about statistical models 
using formulas e.g. y ~ x1 + x2 + x3 etc and then use these 
formulas to generate model matrices for input into statistical 
algorithms.


Sounds interesting.  Take a look at Colvin's dlang science draft 
white paper, and see what you would add.  It's a chance to shape 
things whilst they are still fluid.


3. Solid interface to a big data database, that allows a D data 
table <-> database easily


Which ones do you have in mind for stats?  The different choices 
seem to serve quite different needs.  And when you say big data, 
how big do you typically mean ?


4. Functional programming: especially around data table and 
array structures. R's apply(), lapply(), tapply(), plyr and now 
data.table(,, by = list()) provides powerful tools for data 
manipulation.


Any thoughts on what the design should look like?

To an extent there is a balance between wanting to explore data 
iteratively (when you don't know where you will end up), and 
wanting to build a robust process for production.  I have been 
wondering myself about using LuaJIT to strap together D building 
blocks for the exploration (and calling it based on a custom 
console built around Adam Ruppe's terminal).


5. A factor data type: for categorical variables. This is easy 
to implement! This ties into the creation of model matrices.


6. Nullable types makes talking about missing data more 
straightforward and gives you the opportunity to code them into 
a set value in your analysis. D is streaks ahead of Python 
here, but this is built into R at a basic level.


So matrices with nullable types within?  Is NaN enough for you?  
If not, then it could be quite expensive if the back end is C.


If D can get points 1, 2, 3 many people would be all over D 
because it is a fantastic programming language and is wicked 
fast.

What do you like best about it?  And in your own domain, what 
have the biggest payoffs been in practice?

Re: OT: why do people use python when it is slow?

2015-10-15 Thread Laeeth Isharc via Digitalmars-d-learn


On Thursday, 15 October 2015 at 07:57:51 UTC, Russel Winder wrote:
On Thu, 2015-10-15 at 06:48 +, data pulverizer via 
Digitalmars-d- learn wrote:



[…]

A journey of a thousand miles ...


Exactly.


I tried to start creating a data table type object by 
investigating variantArray: 
http://forum.dlang.org/thread/hhzavwrkbrkjzfohc...@forum.dlang.org 
but hit the snag that D is a static programming language and 
may not allow the kind of behaviour you need for creating 
data-table-like objects.

I envisage such an object as being composed of arrays of 
vectors where each vector represents a column in a table as in 
R - easier for model matrix creation. Some people believe that 
you should work with arrays of tuple rows - which may be more 
big data friendly. I am not overly wedded to either approach.


Anyway it seems I have hit an inherent limitation in the 
language. Correct me if I am wrong. The data frame needs to 
have dynamic behaviour (bind rows and columns, return parts 
of itself as a data table, etc.) and since D is a static language 
we cannot do this.


Just because D doesn't have this now doesn't mean it cannot. C 
doesn't have such a capability, but R and Python do, even though 
R and CPython are just C code.


Pandas data structures rely on the NumPy n-dimensional array 
implementation, it is not beyond the bounds of possibility that 
that data structure could be realized as a D module.


Is R's data.table written in R or in C? In either case, it is 
not beyond the bounds of possibility that that data structure 
could be realized as a D module.


The core issue is to have a seriously efficient n-dimensional 
array that is amenable to data parallelism and is extensible. 
As far as I am aware currently (I will investigate more) the 
NumPy array is a good native code array, but has some issues 
with data parallelism and Pandas has to do quite a lot of work 
to get the extensibility. I wonder how the R data.table works.


I have this nagging feeling that like NumPy, data.table seems a 
lot better than it could be. From small experiments D is (and 
also Chapel is even more) hugely faster than Python/NumPy at 
things Python people think NumPy is brilliant for. Expectations 
of Python programmers are set by the scale of Python 
performance, so NumPy seems brilliant. Compared to the scale 
set by D and Chapel, NumPy is very disappointing. I bet the 
same is true of R (I have never really used R).


This is therefore an opportunity for D to step in. However it 
is a journey of a thousand miles to get something production 
worthy. Python/NumPy/Pandas have had a very large number of 
programmer hours expended on them.  Doing this poorly as a D 
module is likely worse than not doing it at all.


I think it's much better to start, which means solving your own 
problems in a way that is acceptable to you rather than letting 
perfection be the enemy of the good.  It's always easier to do 
something a second time too, as you learn from successes and 
mistakes and you have a better idea about what you want.  Of 
course it's better to put some thought into design early on, but 
that shouldn't end up in analysis paralysis.  John Colvin and 
others are putting quite a lot of thought into dlang science, it 
seems to me, but he is also getting stuff done.  Running D in a 
Jupyter notebook is something very useful.  It doesn't matter 
that it's cosmetically imperfect at this stage, and it won't stay 
that way.  And that's just a small step towards the bigger goal.

Re: Builtin array and AA efficiency questions

2015-10-15 Thread anonymous via Digitalmars-d-learn

On Thursday, October 15, 2015 11:48 PM, Random D user wrote:

> Should array have clear() as well?
> Basically wrap array.length = 0; array.assumeSafeAppend();
> At least it would then be symmetric (and more intuitive) with 
> built-in containers.

No. "clear" is too harmless a name for it to involve an unsafe operation 
like assumeSafeAppend. With containers there is always one container that 
owns the data. There is no such notion with dynamic arrays.

Re: Builtin array and AA efficiency questions

2015-10-15 Thread Steven Schveighoffer via Digitalmars-d-learn


On 10/15/15 12:47 PM, Random D user wrote:

So I was doing some optimizations and I came up with couple basic
questions...

A)
What does assumeSafeAppend actually do?
A.1) Should I always call it before setting length if I want to have
assumeSafeAppend semantics? (e.g. I don't know if it's called just
before the function I'm in)


Without more context, I would say no. assumeSafeAppend is an assumption, 
and therefore unsafe. If you don't know what is passed in, you could 
potentially clobber data.


In addition, assumeSafeAppend is a non-inlineable, runtime function that 
can *potentially* be low-performing. If, for instance, you call it on a 
non-GC array, or one that is not marked for appending, you will most 
certainly need to take the GC lock and search through the heap for your 
block.


The best place to call assumeSafeAppend is when you are sure the array 
has "shrunk" and you are about to append. If you have not shrunk the 
array, then the call is a waste, if you are not sure what the array 
contains, then you are potentially stomping on referenced data.


Calling it just after shrinking every time is possible, but could 
potentially be sub-optimal, if you don't intend to append to that array 
again, or you intend to shrink it again before appending.



A.2) Or does it mark the array/slice itself as a "safe append" array?
And I can call it once.


An array uses a block marked for appending, assumeSafeAppend simply sets 
how much data is assumed to be valid. Calling assumeSafeAppend on a 
block not marked for appending will do nothing except burn CPU cycles.


So yours is not an accurate description.


A.3) If A.2 is true, are there any conditions that it reverts to
original behavior? (e.g. if I take a new slice of that array)


Any time data is appended, all references *besides* the one that was 
used to append now will reallocate on appending. Any time data is shrunk 
(i.e. arr = arr[0..$-1]), that reference now will reallocate on appending.


So when to call really sort of requires understanding what the runtime 
does. Note it is always safe to just never use assumeSafeAppend, it is 
an optimization. You can always append to anything (even non-GC array 
slices) and it will work properly.



I read the array/slice article, but it seems that I still can't use them
with confidence that they actually do what I want. I also tried to look
into lifetime.d, but there are so many potential entry/exit/branch paths
that without case by case debugging (and no debug symbols for phobos.lib)
it's a bit too much.


I recommend NOT to try and understand lifetime.d, it's very complex, and 
the entry points are mostly defined by the compiler. I had to use trial 
and error to understand what happened when.



What I'm trying to do is a reused buffer which only grows in capacity
(and I want to overwrite all data). Preferably I'd manage the current
active size of the buffer as array.length.

For a buffer typical pattern is:
array.length = 100

array.length = 0

some appends

array.length = 50

etc.


This is an easy call then:

array.reserve(100); // reserve 100 elements for appending
array ~= data; // automatically manages array length for you; if length
               // exceeds 100, just automatically reallocates more data.

array.length = 0; // clear all the data
array.assumeSafeAppend; // NOW is the best time to call, because you
                        // can't shrink it any more, and you know you
                        // will be appending again.

array ~= data; // no reallocation, unless previous max size was exceeded.
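
A minimal runnable sketch of the pattern above, for anyone who wants to try it. The helper name `refill` is made up for illustration and is not part of druntime or Phobos; the sketch also assumes the buffer is the only reference to its memory block.

```d
// Hypothetical helper: empty the buffer and append n values,
// reusing the existing GC block where possible.
int[] refill(int[] buf, size_t n)
{
    buf.length = 0;          // shrink: logical size is now zero
    buf.assumeSafeAppend();  // declare the old contents dead
    foreach (i; 0 .. n)
        buf ~= cast(int) i;  // appends in place while capacity allows
    return buf;
}

void main()
{
    int[] array;
    array.reserve(100);      // one up-front allocation
    array = refill(array, 100);
    auto p = array.ptr;

    array = refill(array, 50);
    assert(array.length == 50);
    assert(array.ptr is p);  // same block, no reallocation
}
```

Without the assumeSafeAppend call, the first append after `length = 0` would be expected to allocate a fresh block, because the runtime has to assume the old data is still referenced.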


B.1) I have a temporary AA whose lifetime is limited to a known span
(might be a function or a loop with a couple of functions). Is there a
way to tell the runtime to immediately destroy and free the AA?


There isn't. This reminds me, I have a lingering PR to add aa.clear 
which destroys all the elements, but was waiting until object.clear had 
been removed for the right amount of time. Perhaps it's time to revive that.


-Steve

Re: Builtin array and AA efficiency questions

2015-10-15 Thread H. S. Teoh via Digitalmars-d-learn

On Thu, Oct 15, 2015 at 09:00:36PM +, Random D user via Digitalmars-d-learn 
wrote:
> Thanks for thorough answer.
> 
> On Thursday, 15 October 2015 at 18:46:22 UTC, H. S. Teoh wrote:
[...]
> >The only thing I can think of is to implement this manually, e.g., by
> >wrapping your AA in a type that keeps a size_t "generation counter",
> >where if any value in the AA is found to belong to a generation
> >that's already past, it pretends that the value doesn't exist yet.
> >Something like this:
> 
> Right. Like a handle system or AA of ValueHandles in this case. But
> I'll probably just hack up some custom map and reuse its mem.
> Although, I'm mostly doing this for perf (realloc) and not mem size,
> so it might be too much effort if D AA is highly optimized.

Haha, the current AA implementation is far from being highly optimized.
There has been a slow trickle of gradual improvements over the years,
but if you want maximum performance, you're probably better off writing
a specialized hash map that fits your exact use case better. Or use a
different container that's more cache-friendly. (Hashes exhibit poor
locality, because they basically ensure random memory access patterns,
so your hardware prefetcher's predictions are out the window and it's
almost a guaranteed RAM roundtrip per hash lookup.)

T

-- 
If I were two-faced, would I be wearing this one? -- Abraham Lincoln

Re: Builtin array and AA efficiency questions

2015-10-15 Thread Random D user via Digitalmars-d-learn


Thanks for thorough answer.

On Thursday, 15 October 2015 at 18:46:22 UTC, H. S. Teoh wrote:


It adjusts the size of the allocated block in the GC so that 
subsequent appends will not reallocate.




So how does capacity affect this? I mean what is exactly a GC 
block here.


The "shrink to fit" bit was confusing, but after thinking about 
this for a few minutes I guess there are at least three concepts:


slice          0 .. length
allocation     0 .. max used/init size (end of 'gc block', also shared between slices)
raw mem block  0 .. capacity (or whatever the GC set aside, like pages)


slice is managed by slice instance (ptr, length pair)
allocation is managed by array runtime (max used by some array)
raw mem block is managed by gc (knows the actual mem block)

So if slice.length != allocation.length then the slice is not a mem 
"owning" array (it's a reference).
And assumeSafeAppend sets allocation.length to slice.length, i.e. 
shrinks to fit. (slice.length > allocation.length is not possible, 
because allocation.length = max(slice.length), so it always just 
shrinks.)
Now that the slice is a mem "owning" array, it owns the length: 
growing the length happens without reallocation until it hits raw 
mem block.length (aka capacity).


So basically the largest slice owns the memory allocation and 
its length.


This is my understanding now. Although, I'll probably forget all 
this in 5..4..3..2...


The thought that occurs to me is that you could still use the 
built-in arrays as a base for your Buffer type, but with 
various operators overridden so that it doesn't reallocate 
unnecessarily.


Right, so custom array/buffer type it is. Seems the simplest 
solution.
I already started implementing this. Reusable arrays are 
everywhere.


If you want to manually delete data, you probably want to 
implement your own AA based on malloc/free instead of the GC. 
The nature of GC doesn't lend it well to manual management.


I'll have to do this as well. Although, this one isn't that 
critical for me.


The only thing I can think of is to implement this manually, 
e.g., by wrapping your AA in a type that keeps a size_t 
"generation counter", where if any value in the AA is found to 
belong to a generation that's already past, it pretends that 
the value doesn't exist yet.  Something like this:


Right. Like a handle system or AA of ValueHandles in this case. 
But I'll probably just hack up some custom map and reuse its 
mem. Although, I'm mostly doing this for perf (realloc) and not 
mem size, so it might be too much effort if D AA is highly 
optimized.

Re: Alias lamda argument vs Type template argument

2015-10-15 Thread Rikki Cattermole via Digitalmars-d-learn


On 16/10/15 4:02 PM, Freddy wrote:

There are two different ways to pass functions as template
arguments. Which is preferred?
---
void funcA(alias calle)()
{
 calle();
}

void funcB(T)(T calle)
{
 calle();
}

void main()
{
 funcA!(() => 0);
 funcB(() => 0);
}
---


Depends, do you need it at compile time or at runtime?
funcA is at compile time and funcB is at runtime.
If at runtime, you'll probably want to define it anyway and not bother 
with templates.


If you are passing it in for compile time, you are probably doing it for 
usage with traits. In any case, you're better off calling with runtime 
args, since you know the arguments and the return type, at least per 
your example.
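
To make the distinction concrete, here is a small sketch of both forms. The names `callAlias` and `callValue` are invented for this example; they mirror funcA and funcB from the question but return the callee's result so the difference can be checked.

```d
// The callee is a template (compile-time) argument: each distinct
// callee produces a separate instantiation.
int callAlias(alias callee)()
{
    return callee();
}

// The callee is an ordinary run-time argument; only its type T is
// a template parameter.
int callValue(T)(T callee)
{
    return callee();
}

void main()
{
    // Function literal bound at compile time.
    assert(callAlias!(() => 1)() == 1);

    // Delegate created at run time, capturing a local variable.
    int x = 41;
    auto dg = () => x + 1;
    assert(callValue(dg) == 42);

    // The delegate can be swapped out at run time without
    // creating a new instantiation of callValue.
    dg = () => x - 1;
    assert(callValue(dg) == 40);
}
```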

Re: Builtin array and AA efficiency questions

2015-10-15 Thread Mike Parker via Digitalmars-d-learn


On Thursday, 15 October 2015 at 21:48:29 UTC, Random D user wrote:

An array uses a block marked for appending, assumeSafeAppend 
simply sets how much data is assumed to be valid. Calling 
assumeSafeAppend on a block not marked for appending will do 
nothing except burn CPU cycles.


So yours is not an accurate description.


Related to my question above.
How do you get a block not marked for appending? a view slice?

Perhaps I should re-read the slice article. I believe it had 
something like capacity == 0 --> always allocates. Is it this?


There are a handful of attributes that can be set on memory 
allocated by the GC. See the BlkAttr enumeration in core.memory 
[1]. Under the hood, memory for dynamic arrays (slices) is marked 
with BlkAttr.APPENDABLE. If an array points to memory not 
marked as such, whether manually allocated through the GC, through 
malloc, or from another source, then assumeSafeAppend can't help you.


capacity tells you how many more elements can be appended to a 
dynamic array (slice) before an allocation will be triggered. So 
if you get a 0, that means the next append will trigger one. 
Consider this:


int[] dynarray = [1, 2, 3, 4, 5];
auto slice = dynarray[0 .. $-1];

slice points to the same memory as dynarray, but has 4 elements 
whereas dynarray has 5. Appending a single element to slice 
without reallocating will overwrite the 5 in that memory block, 
meaning dynarray will see the new value. For that reason, new 
slices like this will always have a 0 capacity. Append a new item 
to slice and a reallocation occurs, copying the existing elements 
of slice over and adding the new one. This way, dynarray's values 
are untouched and both arrays point to different blocks of memory.


assumeSafeAppend changes this behavior such that appending a new 
item to slice will reuse the same memory block, causing the 5 
to be overwritten. Normally, you don't want to use it unless you 
are sure there are no other slices pointing to the same memory 
block. So it's not something you should be using in a function 
that can receive an array from any source. That array might share 
memory with other slices, the block might not be appendable, you 
have no idea how the slice is actually used... just a bad idea. 
When you have complete control over a slice and know exactly how 
it is used, such as an internal buffer, then it becomes a useful 
tool.
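
The behaviour described above can be checked with a short program. This is only a sketch, and it assumes the array literal is allocated on the GC heap in an appendable block, which is what current compilers do for a literal assigned to an `int[]`.

```d
void main()
{
    int[] dynarray = [1, 2, 3, 4, 5];
    auto slice = dynarray[0 .. $ - 1];

    // The slice ends before the used part of the block does, so
    // the runtime reports no room to append in place.
    assert(slice.capacity == 0);

    // Appending therefore reallocates and leaves dynarray alone.
    slice ~= 42;
    assert(slice.ptr !is dynarray.ptr);
    assert(dynarray[4] == 5);

    // With assumeSafeAppend the append happens in place and
    // stomps on the 5 that dynarray still refers to.
    auto slice2 = dynarray[0 .. $ - 1];
    slice2.assumeSafeAppend();
    slice2 ~= 99;
    assert(slice2.ptr is dynarray.ptr);
    assert(dynarray[4] == 99);
}
```

The last two assertions are exactly the "stomping" that makes assumeSafeAppend unsuitable for arrays whose other references are unknown.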


[1] http://dlang.org/phobos/core_memory.html#.GC.BlkAttr

Re: Strange behavior of array

2015-10-15 Thread Mike Parker via Digitalmars-d-learn


On Friday, 16 October 2015 at 03:01:12 UTC, VlasovRoman wrote:


Oh, thank you. Some strange solution.


D doesn't have multidimensional built-in arrays, but rectangular 
arrays. Think of it this way:


int[3] a1;

a1 is a static array of 3 ints. Indexing it returns an int. We 
can think of it like this:


(int)[3]

On the same lines:

int[3][4] a2;

a2 is a static array of 4 static arrays of 3 ints. In other words:

(int[3])[4].

Therefore, a2[0] returns the first int[3], a2[1] the second, 
and so on.


a2[0][1] returns the second element of the first int[3].

Rikki's solution to your problem was to reverse the indexes when 
reading the array. But if you want to index it just as you would 
in C or C++, you should reverse the indexes in the declaration. 
Where you declare int[rows][columns] in C, you would declare 
int[columns][rows] in D, then reading from them is identical.
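
A short sketch of the declaration order described above; the variable name `a2` follows the text and the values are arbitrary.

```d
void main()
{
    int[3][4] a2;   // 4 elements, each of which is an int[3]

    // Indexing once yields a whole int[3].
    static assert(is(typeof(a2[0]) == int[3]));
    assert(a2.length == 4);
    assert(a2[0].length == 3);

    // The first index ranges over 0 .. 4, the second over 0 .. 3,
    // i.e. the reverse of the order in the declaration.
    a2[3][2] = 7;
    assert(a2[3][2] == 7);
}
```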

Re: OT: why do people use python when it is slow?

2015-10-15 Thread jmh530 via Digitalmars-d-learn


On Thursday, 15 October 2015 at 10:33:54 UTC, Russel Winder wrote:


CUDA is of course doomed in the long run as Intel put GPGPU on 
the processor chip. OpenCL will eventually be replaced with 
Vulkan (assuming they can get the chips made).


I thought Vulkan was meant to replace OpenGL.

Re: OT: why do people use python when it is slow?

2015-10-15 Thread Russel Winder via Digitalmars-d-learn

On Thu, 2015-10-15 at 17:00 +, jmh530 via Digitalmars-d-learn
wrote:
> On Thursday, 15 October 2015 at 10:33:54 UTC, Russel Winder wrote:
> > 
> > CUDA is of course doomed in the long run as Intel put GPGPU on 
> > the processor chip. OpenCL will eventually be replaced with 
> > Vulkan (assuming they can get the chips made).
> 
> I thought Vulkan was meant to replace OpenGL.

True, but there is an intent to try and have Vulkan allow for replacing
both OpenGL and OpenCL.

-- 
Russel.
=
Dr Russel Winder  t: +44 20 7585 2200   voip: sip:russel.win...@ekiga.net
41 Buckmaster Roadm: +44 7770 465 077   xmpp: rus...@winder.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder




Builtin array and AA efficiency questions

2015-10-15 Thread Random D user via Digitalmars-d-learn

So I was doing some optimizations and I came up with couple basic 
questions...


A)
What does assumeSafeAppend actually do?
A.1) Should I always call it before setting length if I want 
to have assumeSafeAppend semantics? (e.g. I don't know if it's 
called just before the function I'm in)
A.2) Or does it mark the array/slice itself as a "safe append" 
array? And I can call it once.
A.3) If A.2 is true, are there any conditions that it reverts to 
original behavior? (e.g. if I take a new slice of that array)


I read the array/slice article, but it seems that I still can't 
use them with confidence that they actually do what I want. I 
also tried to look into lifetime.d, but there are so many potential 
entry/exit/branch paths that without case by case debugging (and 
no debug symbols for phobos.lib) it's a bit too much.


What I'm trying to do is a reused buffer which only grows in 
capacity (and I want to overwrite all data). Preferably I'd 
manage the current active size of the buffer as array.length.


For a buffer typical pattern is:
array.length = 100
...
array.length = 0
...
some appends
...
array.length = 50
...
etc.

There's just so much magic going on behind D arrays that it's a bit 
cumbersome to track manually what's actually happening: when it 
allocates and when it doesn't.
So, I already started doing my own Buffer type which gives me 
explicit control, but I wonder if there's a better way.


B.1) I have a temporary AA whose lifetime is limited to a known 
span (might be a function or a loop with a couple of functions). 
Is there a way to tell the runtime to immediately destroy and free 
the AA?


I'd like to assist the gc with manually destroying some AAs that 
I know I don't need anymore. I don't really want to get rid of 
gc, I just don't want to just batch it into some big batch of gc 
cycle work, since I know right then and there that I'm done with 
it.


For arrays you can do:
int[] arr;
arr.length = 100;
delete arr; // I assume this frees it

but for AAs:
int[string] aa;
delete aa; // gives compiler error  Error: cannot delete type 
int[string]


I could do aa.destroy(), but that just leaves it to gc according 
to docs.


Maybe I should start writing my own hashmap type as well?

B.2) Is there a simple way to reuse the memory/object of the AA?

I could just reuse a preallocated temp AA instead of 
alloc/freeing it.

Re: Builtin array and AA efficiency questions

2015-10-15 Thread H. S. Teoh via Digitalmars-d-learn

On Thu, Oct 15, 2015 at 04:47:35PM +, Random D user via Digitalmars-d-learn 
wrote:
> So I was doing some optimizations and I came up with couple basic
> questions...
> 
> A)
> What does assumeSafeAppend actually do?

It adjusts the size of the allocated block in the GC so that subsequent
appends will not reallocate.

Basically, whenever you try to append to an array and the end of the
array is not the same as the end of the allocated GC block, the GC will
conservatively assume that somebody else has an array (i.e. slice) that
points to the data between the end of the array and the end of the
block, so it will allocate a new block and copy the array to the new
block before appending the new data. Calling assumeSafeAppend "shrink
fits" the allocated GC block to the end of the array, so that the GC
won't reallocate, but simply extend the block to accommodate the new
element.

> A.1) Should I always call it before setting length if I want to
> have assumeSafeAppend semantics? (e.g. I don't know if it's called
> just before the function I'm in)

Probably, otherwise the GC may sometimes reallocate when you don't want
it to.

> A.2) Or does it mark the array/slice itself as a "safe append" array?
> And I can call it once.

Not that I know of.

> A.3) If A.2 is true, are there any conditions that it reverts to
> original behavior? (e.g. if I take a new slice of that array)
[...]
> What I'm trying to do is a reused buffer which only grows in capacity
> (and I want to overwrite all data). Preferably I'd manage the current
> active size of the buffer as array.length.
[...]
> There's just so much magic going behind d arrays that it's a bit
> cumbersome to track manually what's actually happening. When it
> allocates and when it doesn't.
> So, I already started doing my own Buffer type which gives me explicit
> control, but I wonder if there's a better way.

This is probably the best way to do it, since the built-in arrays do
have a lot of "interesting" quirks that probably don't really do what
you want.

The thought that occurs to me is that you could still use the built-in
arrays as a base for your Buffer type, but with various operators
overridden so that it doesn't reallocate unnecessarily. So you'd keep a
T[] as the underlying array, but keep track of .length separately and
override the ~ and ~= operators so that they update Buffer.length
instead of the .length of the underlying array. Only when Buffer.length
is greater than .length, you'd increment .length so that the GC will
reallocate as needed. Similarly, you might want to override the slicing
operators as well so that they also return Buffer types instead of T[],
so that the user doesn't accidentally get access to the raw T[] and
cause unnecessary reallocations.
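
A rough sketch of what such a wrapper might look like. Everything here (the name `Buffer`, the doubling growth policy, the initial size of 16) is an assumption made for illustration; unlike the suggestion above, slicing returns a plain T[] view to keep the example short.

```d
struct Buffer(T)
{
    private T[] data;     // backing store; data.length is the high-water mark
    private size_t used;  // logical length, tracked separately

    size_t length() const { return used; }

    // Forget the contents but keep the memory.
    void clear() { used = 0; }

    // buf ~= value: only touches data.length when the store is full.
    void opOpAssign(string op : "~")(T value)
    {
        if (used == data.length)
            data.length = data.length ? data.length * 2 : 16;
        data[used++] = value;
    }

    // buf[]: a view of the live elements.
    inout(T)[] opSlice() inout { return data[0 .. used]; }
}

void main()
{
    Buffer!int buf;
    foreach (i; 0 .. 100)
        buf ~= i;
    auto p = buf[].ptr;

    buf.clear();
    foreach (i; 0 .. 50)
        buf ~= i;

    assert(buf.length == 50);
    assert(buf[].ptr is p);   // memory was reused
    assert(buf[][49] == 49);
}
```

Because the logical length lives in `used`, clearing and refilling never asks the runtime whether an in-place append is safe, so assumeSafeAppend is not needed at all in this design.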

> B.1) I have a temporary AA whose lifetime is limited to a known span
> (might be a function or a loop with a couple of functions). Is there a
> way to tell the runtime to immediately destroy and free the AA?
> 
> I'd like to assist the gc with manually destroying some AAs that I
> know I don't need anymore. I don't really want to get rid of gc, I
> just don't want to just batch it into some big batch of gc cycle work,
> since I know right then and there that I'm done with it.
> 
> For arrays you can do:
> int[] arr;
> arr.length = 100;
> delete arr; // I assume this frees it

Unfortunately, delete has been deprecated, and may not be around for
very much longer.

> but for AAs:
> int[string] aa;
> delete aa; // gives compiler error  Error: cannot delete type int[string]
> 
> I could do aa.destroy(), but that just leaves it to gc according to docs.

Perhaps what you could do is to trigger GC collection after setting the
AA to null:

aa = null;  // delete references to GC data
GC.collect();   // run collection cycle to free it

I'm not sure if it's a good idea to run collection cycles too often,
though, it will have performance impact.

> Maybe I should start writing my own hashmap type as well?

If you want to manually delete data, you probably want to implement your
own AA based on malloc/free instead of the GC. The nature of GC doesn't
lend it well to manual management.

> B.2) Is there a simple way to reuse the memory/object of the AA?
> 
> I could just reuse a preallocated temp AA instead of alloc/freeing it.

Not that I know of... unfortunately, the current AA implementation
doesn't allow overriding of the allocator; it's hardcoded to use the
default GC.  This may change in the distant future, but I don't see it
happening anytime soon.

The only thing I can think of is to implement this manually, e.g., by
wrapping your AA in a type that keeps a size_t "generation counter",
where if any value in the AA is found to belong to a generation that's
already past, it pretends that the value doesn't exist yet.  Something
like this:

struct AA(K,V) {
static struct WrappedValue {
size_t generation;
V value;
