Re: mmap file performance

2024-04-15 Thread Patrick Schluter via Digitalmars-d-learn

On Thursday, 11 April 2024 at 00:24:44 UTC, Andy Valencia wrote:
I wrote a "count newlines" based on mapped files.  It used 
about twice the CPU of the version which just read 1 meg at a 
time.  I thought something was amiss (needless slice 
indirection or something), so I wrote the code in C.  It had 
the same CPU usage as the D version.  So...mapped files, not so 
much.  Not D's fault.  And writing it in C made me realize how 
much easier it is to code in D!


[...]


The setup of a memory-mapped file is relatively costly. For 
smaller files it is a net loss and read/write beats it hands 
down. Furthermore, sequential access is not the best way to 
exploit the advantages of mmap. Full random access is the strong 
suit of mmap, as it replaces kernel syscalls (lseek, read, write 
or pread, pwrite) with user-land processing.
You could try the MAP_POPULATE option of mmap: it enables 
read-ahead on the file, which may help for sequential code.
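
To make the suggestion concrete, here is a minimal sketch of such 
a mapping, assuming Linux (error handling reduced to asserts; 
`mapWholeFile` is a made-up helper name):

```d
import core.sys.posix.sys.mman;                 // mmap, PROT_READ, MAP_PRIVATE, MAP_FAILED
import core.sys.linux.sys.mman : MAP_POPULATE;  // Linux-specific flag
import core.sys.posix.fcntl : open, O_RDONLY;
import core.sys.posix.unistd : close;

// Map a whole file read-only and ask the kernel to pre-fault the
// pages (MAP_POPULATE), which acts like read-ahead for sequential scans.
const(ubyte)[] mapWholeFile(const(char)* path, size_t len)
{
    int fd = open(path, O_RDONLY);
    assert(fd >= 0);
    void* p = mmap(null, len, PROT_READ, MAP_PRIVATE | MAP_POPULATE, fd, 0);
    close(fd);                     // the mapping outlives the descriptor
    assert(p != MAP_FAILED);
    return (cast(const(ubyte)*) p)[0 .. len];
}
```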


Re: Idiomatic D using GC as a library writer

2022-12-05 Thread Patrick Schluter via Digitalmars-d-learn

On Sunday, 4 December 2022 at 23:37:39 UTC, Ali Çehreli wrote:

On 12/4/22 15:25, Adam D Ruppe wrote:

> which would trigger the write barrier. The thread isn't
> allowed to complete this operation until the GC is done.

According to my limited understanding of write barriers, the 
thread moving to 800 could continue because order of memory 
operations may have been satisfied. What I don't see is, what 
would the GC thread be waiting for about the write to 800?


I'm not a specialist, but I have the impression that GC write 
barriers and CPU memory-ordering write barriers are two different 
concepts that confusingly share the same name.




Would the GC be leaving behind writes to every page it scans, 
which have barriers around so that the other thread can't 
continue? But then the GC's write would finish and the other 
thread's write would finish.


Ok, here is the question: Is there a very long standing partial 
write that the GC can perform like: "I write to 0x42, but I 
will finish it 2 seconds later. So, all other writes should 
wait?"


> The GC finishes its work and releases the barriers.

So, it really is explicit acquisition and releasing of these 
barriers... I think this is provided by the CPU, not the OS. 
How many explicit write barriers are there?


Ali





Re: Float rounding (in JSON)

2022-10-14 Thread Patrick Schluter via Digitalmars-d-learn
On Thursday, 13 October 2022 at 19:27:22 UTC, Steven 
Schveighoffer wrote:

On 10/13/22 3:00 PM, Sergey wrote:

[...]


It doesn't look really that far off. You can't expect floating 
point parsing to be exact, as floating point does not perfectly 
represent decimal numbers, especially when you get down to the 
least significant bits.


[...]
To me it looks like there is a conversion to `real` (80 bit 
floats) somewhere in the D code and that the other languages stay 
in `double` mode everywhere. Maybe forcing `double` by disabling 
x87 on the D side would yield the same results as the other 
languages?




Re: Replacing tango.text.Ascii.isearch

2022-10-13 Thread Patrick Schluter via Digitalmars-d-learn

On Thursday, 13 October 2022 at 08:27:17 UTC, bauss wrote:
On Wednesday, 5 October 2022 at 17:29:25 UTC, Steven 
Schveighoffer wrote:

On 10/5/22 12:59 PM, torhu wrote:
I need a case-insensitive check to see if a string contains 
another string for a "quick filter" feature. It should 
preferably be perceived as instant by the user, and needs to 
check a few thousand strings in typical cases. Is a regex the 
best option, or what would you suggest?


https://dlang.org/phobos/std_uni.html#asLowerCase

```d
bool isearch(S1, S2)(S1 haystack, S2 needle)
{
    import std.uni;
    import std.algorithm;
    return haystack.asLowerCase.canFind(needle.asLowerCase);
}
```

untested.

-Steve


This doesn't actually work properly in all languages. It will 
probably work in most, but it's not entirely correct.


Ex. Turkish will not work with it properly.


Greek will also be problematic: 2 different lowercase sigmas but 
only 1 uppercase. Other languages may cause issues too, e.g. 
German, where ß normally uppercases as SS (or not) but not the 
other way round; but here we have already arrived in Unicode land 
and its normalization conundrum.
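
To make the Turkish example concrete, a tiny sketch of where the 
default Unicode case mapping goes wrong (std.uni applies the 
locale-independent mapping):

```d
import std.uni : toLower;

void main()
{
    // In Turkish, uppercase 'I' lowercases to dotless 'ı', not 'i'.
    // std.uni implements the default (language-neutral) mapping:
    assert("I".toLower == "i");   // correct by default, wrong for Turkish
}
```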





Re: Programs in D are huge

2022-08-19 Thread Patrick Schluter via Digitalmars-d-learn
On Thursday, 18 August 2022 at 17:15:12 UTC, rikki cattermole 
wrote:


On 19/08/2022 4:56 AM, IGotD- wrote:
BetterC means no arrays or strings library, and usually in 
terminal tools you need to process text. Full D is wonderful 
for such tasks, but betterC would be limited unless you want to 
write your own array and string functionality.


Unicode support in Full D isn't complete.

There is nothing in phobos to even change case correctly!

Both are limited if you care about certain stuff like non-latin 
based languages like Turkic.


A general toupper/tolower for Unicode is doomed to fail. As 
already mentioned, Turkish has its specificities, but other 
languages also have traps. In Greek, toupper/tolower is not 
reversible, i.e. `x.toupper.tolower == x` is not guaranteed. Some 
languages have 1 codepoint as input and 2 codepoints as result 
(German ß becomes SS in most cases; the capital ẞ is not the 
right choice in most cases).

etc. etc.
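
A quick sketch of the Greek round-trip failure with std.uni:

```d
import std.uni : toUpper, toLower;

void main()
{
    // Final sigma ς (U+03C2) uppercases to Σ (U+03A3),
    // which lowercases back to the medial sigma σ (U+03C3):
    assert("ς".toUpper == "Σ");
    assert("ς".toUpper.toLower == "σ");  // x.toupper.tolower != x
}
```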


Re: A look inside "filter" function defintion

2022-08-09 Thread Patrick Schluter via Digitalmars-d-learn

On Tuesday, 2 August 2022 at 12:39:41 UTC, pascal111 wrote:

On Tuesday, 2 August 2022 at 04:06:30 UTC, frame wrote:

On Monday, 1 August 2022 at 23:35:13 UTC, pascal111 wrote:
This is the definition of "filter" function, and I think it 
called itself within its definition. I'm guessing how it 
works?


It's a template that defines the function called "Eponymous 
Templates":

https://dlang.org/spec/template.html#implicit_template_properties

A template generates code, it cannot be called, only 
instantiated.


The common syntax is just a shortcut for using it. Otherwise 
you would need to write `filter!(a => a > 0).filter([1, -1, 2, 
0, -3])`. Like UFCS, some magic the compiler does for you.


Instantiation seems some complicated to me. I read "If a 
template contains members whose name is the same as the 
template identifier then these members are assumed to be 
referred to in a template instantiation:" in the provided link, 
but I'm still stuck. Do you have a down-to-earth example for 
beginners to understand this concept?


A template is conceptually like a macro with parameters in C. An 
instantiation is like the use of the macro in your C program. 
The fundamental difference is that the template is syntactically 
and semantically integrated into the language. In C, the 
preprocessor is just a textual replacement done before the proper 
compilation. This means that there are things you couldn't do in 
the pre-processor (like `#if sizeof(int)==4`) and (horrible) 
things that never should have been possible (I used to use the C 
pre-processor with other languages like AutoLISP and dBase III).
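
Since a down-to-earth example was asked for, here is a minimal 
sketch of an eponymous template (`twice` is a made-up name):

```d
template twice(T)
{
    // The member has the same name as the template, so the
    // instantiation twice!int stands for this function directly.
    T twice(T x)
    {
        return x + x;
    }
}

void main()
{
    assert(twice!int(3) == 6);
    assert(twice!double(1.5) == 3.0);
}
```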





Re: Make shared static this() encoding table compilable

2022-03-17 Thread Patrick Schluter via Digitalmars-d-learn
On Thursday, 17 March 2022 at 12:19:36 UTC, Patrick Schluter 
wrote:
On Thursday, 17 March 2022 at 12:11:19 UTC, Patrick Schluter 
wrote:
On Thursday, 17 March 2022 at 11:36:40 UTC, Patrick Schluter 
wrote:



[...]

Something akin to
```d
auto lookup(ushort key)
{
  return cp949[key-0x8141];
}

[...]


Takes 165 ms to compile with dmd 2.094.2 -O on [godbolt] with 
the whole table generated from the Unicode link.


[godbolt]: https://godbolt.org/z/hEzP7rKnn]


Oops, remove the ] at the end of the link to [godbolt]

[godbolt]: https://godbolt.org/z/hEzP7rKnn


Re: Make shared static this() encoding table compilable

2022-03-17 Thread Patrick Schluter via Digitalmars-d-learn
On Thursday, 17 March 2022 at 12:11:19 UTC, Patrick Schluter 
wrote:
On Thursday, 17 March 2022 at 11:36:40 UTC, Patrick Schluter 
wrote:



[...]

Something akin to
```d
auto lookup(ushort key)
{
  return cp949[key-0x8141];
}

[...]


Takes 165 ms to compile with dmd 2.094.2 -O on [godbolt] with the 
whole table generated from the Unicode link.


[godbolt]: https://godbolt.org/z/hEzP7rKnn]


Re: Make shared static this() encoding table compilable

2022-03-17 Thread Patrick Schluter via Digitalmars-d-learn
On Thursday, 17 March 2022 at 11:36:40 UTC, Patrick Schluter 
wrote:

On Monday, 14 March 2022 at 09:40:00 UTC, zhad3 wrote:
Hey everyone, I am in need of some help. I have written this 
Windows CP949 encoding table 
https://github.com/zhad3/zencoding/blob/main/windows949/source/zencoding/windows949/table.d which is used to convert CP949 to UTF-16.


After some research about how to initialize immutable 
associative arrays people suggested using `shared static 
this()`. So far this worked for me, but I recently discovered 
that DMD cannot compile this in release mode with 
optimizations.


`dub build --build=release`  or `dmd` with `-release -O` fails:

```
code  windows949
function  
zencoding.windows949.fromWindows949!(immutable(ubyte)[]).fromWindows949

code  table
function  zencoding.windows949.table._sharedStaticCtor_L29_C1
dmd failed with exit code -11.
```

I usually compile my projects using LDC where this works fine, 
but I don't want to force others to use LDC because of this 
one problem.


Hence I'd like to ask on how to change the code so that it 
compiles on DMD in release mode (with optimizations). I 
thought about having a computational algorithm instead of an 
encoding table but sadly I could not find any references in 
that regard. Apparently encoding tables seem to be the 
standard.


Why not use a simple static array (not an associative array), 
where the values are indexed on `key - min(keys)`? Even with 
the holes in the keys (i.e. keys that do not have corresponding 
values) it will be smaller than the constructed associative 
array. The lookup is also faster.

Something akin to
```d
auto lookup(ushort key)
{
  return cp949[key-0x8141];
}

immutable ushort[0xFDFE-0x8141+1] cp949 = [
0x8141-0x8141: 0xAC02,
0x8142-0x8141: 0xAC03,
0x8143-0x8141: 0xAC05,
0x8144-0x8141: 0xAC06,
0x8145-0x8141: 0xAC0B,
0x8146-0x8141: 0xAC0C,
0x8147-0x8141: 0xAC0D,
0x8148-0x8141: 0xAC0E,
0x8149-0x8141: 0xAC0F,
0x814A-0x8141: 0xAC18,
0x814B-0x8141: 0xAC1E,
0x814C-0x8141: 0xAC1F,
0x814D-0x8141: 0xAC21,
0x814E-0x8141: 0xAC22,
0x814F-0x8141: 0xAC23,
0x8150-0x8141: 0xAC25,
0x8151-0x8141: 0xAC26,
0x8152-0x8141: 0xAC27,
0x8153-0x8141: 0xAC28,
0x8154-0x8141: 0xAC29,
0x8155-0x8141: 0xAC2A,
0x8156-0x8141: 0xAC2B,
0x8157-0x8141: 0xAC2E,
0x8158-0x8141: 0xAC32,
0x8159-0x8141: 0xAC33,
0x815A-0x8141: 0xAC34,
0x8161-0x8141: 0xAC35,
0x8162-0x8141: 0xAC36,
0x8163-0x8141: 0xAC37,
...
0xFDFA-0x8141: 0x72A7,
0xFDFB-0x8141: 0x79A7,
0xFDFC-0x8141: 0x7A00,
0xFDFD-0x8141: 0x7FB2,
0xFDFE-0x8141: 0x8A70,
];
```


Re: Make shared static this() encoding table compilable

2022-03-17 Thread Patrick Schluter via Digitalmars-d-learn

On Monday, 14 March 2022 at 09:40:00 UTC, zhad3 wrote:
Hey everyone, I am in need of some help. I have written this 
Windows CP949 encoding table 
https://github.com/zhad3/zencoding/blob/main/windows949/source/zencoding/windows949/table.d which is used to convert CP949 to UTF-16.


After some research about how to initialize immutable 
associative arrays people suggested using `shared static 
this()`. So far this worked for me, but I recently discovered 
that DMD cannot compile this in release mode with optimizations.


`dub build --build=release`  or `dmd` with `-release -O` fails:

```
code  windows949
function  
zencoding.windows949.fromWindows949!(immutable(ubyte)[]).fromWindows949

code  table
function  zencoding.windows949.table._sharedStaticCtor_L29_C1
dmd failed with exit code -11.
```

I usually compile my projects using LDC where this works fine, 
but I don't want to force others to use LDC because of this one 
problem.


Hence I'd like to ask on how to change the code so that it 
compiles on DMD in release mode (with optimizations). I thought 
about having a computational algorithm instead of an encoding 
table but sadly I could not find any references in that regard. 
Apparently encoding tables seem to be the standard.


Why not use a simple static array (not an associative array), 
where the values are indexed on `key - min(keys)`? Even with the 
holes in the keys (i.e. keys that do not have corresponding 
values) it will be smaller than the constructed associative 
array. The lookup is also faster.


Re: ldc executable crashes with this code

2022-02-04 Thread Patrick Schluter via Digitalmars-d-learn

On Thursday, 3 February 2022 at 02:01:34 UTC, forkit wrote:

On Thursday, 3 February 2022 at 01:57:12 UTC, H. S. Teoh wrote:




would be nice if the compiler told me something though :-(

i.e. "hey, dude, you really wanna to that?"


It would be nice if programmers (C or D) learnt that a typecast 
means "shut up compiler, I know what I'm doing". You explicitly 
instructed the compiler not to complain.


Remove the typecast and the compiler will raise an error.

That's the reason why typecasts are to be avoided as much as 
possible. They are often a code smell.


Re: gdc or ldc for faster programs?

2022-01-31 Thread Patrick Schluter via Digitalmars-d-learn

On Tuesday, 25 January 2022 at 22:41:35 UTC, Elronnd wrote:

On Tuesday, 25 January 2022 at 22:33:37 UTC, H. S. Teoh wrote:
interesting because idivl is known to be one of the slower 
instructions, but gdc nevertheless considered it not 
worthwhile to replace it, whereas ldc seems obsessed with 
avoiding idivl at all costs.


Interesting indeed.  Two remarks:

1. Actual performance cost of div depends a lot on hardware.  
IIRC on my old intel laptop it's like 40-60 cycles; on my newer 
amd chip it's more like 20; on my mac it's ~10.  GCC may be 
assuming newer hardware than llvm.  Could be worth popping on a 
-march=native -mtune=native.  Also could depend on how many 
ports can do divs; i.e. how many of them you can have running 
at a time.


2. LLVM is more aggressive wrt certain optimizations than gcc, 
by default.  Though I don't know how relevant that is at -O3.


-O3 often chooses longer code and unrolls more aggressively, 
inducing higher miss rates in the instruction caches.

-O2 can beat -O3 in some cases when code size is important.


Re: How to print unicode characters (no library)?

2021-12-28 Thread Patrick Schluter via Digitalmars-d-learn

On Monday, 27 December 2021 at 07:12:24 UTC, rempas wrote:


I don't understand that. Based on your calculations, the 
results should have been different. Also how are the numbers 
fixed? Like you said the amount of bytes of each encoding is 
not always standard for every character. Even if they were 
fixed this means 2-bytes for each UTF-16 character and 4-bytes 
for each UTF-32 character so still the numbers doesn't make 
sense to me. So still the number of the "length" property 
should have been the same for every encoding or at least for 
UTF-16 and UTF-32. So are the sizes of every character fixed or 
not?




Your string is represented by 8 codepoints. The number of 
codeunits needed to represent them in memory depends on the 
encoding. D supports working with 3 different encodings (the 
Unicode standard has more than these 3):


string  utf8s  = "Hello \U0001F602\n";
wstring utf16s = "Hello \U0001F602\n"w;
dstring utf32s = "Hello \U0001F602\n"d;

Here is the canonical Unicode representation of your string:

   H      e      l      l      o    (sp)    😂     \n
U+0048 U+0065 U+006C U+006C U+006F U+0020 U+1F602 U+000A

Let's see how these 3 variables are represented in memory:

utf8s : 48 65 6C 6C 6F 20 F0 9F 98 82 0A
11 char in memory using 11 bytes

utf16s: 0048 0065 006C 006C 006F 0020 D83D DE02 000A
9 wchar in memory using 18 bytes

utf32s: 00000048 00000065 0000006C 0000006C 0000006F 00000020 0001F602 0000000A
8 dchar in memory using 32 bytes

As you can see, the most compact form is generally UTF-8, that's 
why it is the preferred encoding for Unicode.
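
Since the question was about `.length`, here is a runnable sketch 
checking the counts above (`\U0001F602` is the 😂 codepoint from 
the table):

```d
void main()
{
    import std.stdio : writeln;

    string  utf8s  = "Hello \U0001F602\n";
    wstring utf16s = "Hello \U0001F602\n"w;
    dstring utf32s = "Hello \U0001F602\n"d;

    writeln(utf8s.length);   // 11 UTF-8 code units (11 bytes)
    writeln(utf16s.length);  //  9 UTF-16 code units (18 bytes)
    writeln(utf32s.length);  //  8 UTF-32 code units (32 bytes)
}
```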


UTF-16 is supported for legacy reasons: it is used in the 
Windows API and also internally in Java.


UTF-32 has one advantage in that it has a 1-to-1 mapping between 
codepoint and array index. In practice it is not that much of an 
advantage, as codepoints and characters are distinct concepts. 
UTF-32 uses a lot of memory for practically no benefit (when you 
read in the forum about the big auto-decode error of D, it is 
linked to this).


Re: Wrong result with enum

2021-11-11 Thread Patrick Schluter via Digitalmars-d-learn

On Thursday, 11 November 2021 at 05:37:05 UTC, Salih Dincer wrote:

is this an issue, do you need to cast?

```d
enum tLimit = 10_000;  // (1) true result
enum wLimit = 100_000; // (2) wrong result

void main()
{
  size_t subTest1 = tLimit;
  assert(subTest1 == tLimit);/* no error */

  size_t subTest2 = wLimit;
  assert(subTest2 == wLimit);/* no error */

  size_t gauss = (tLimit * (tLimit + 1)) / 2;
  assert(gauss == 50_005_000);   /* no error */

  gauss = (wLimit * (wLimit + 1)) / 2;
  assert(gauss == 5_000_050_000); /* failure */

  // Fleeting Solution:
  enum size_t limit = 100_000;
  gauss = (limit * (limit + 1)) / 2;
  assert(gauss == 5_000_050_000); /* no error */

} /* Small Version:

void main(){
  enum t = 10_000;
  size_t a = t * t;
  assert(a == 100_000_000);// No Error

  enum w = 100_000;
  size_t b = w * w;
  assert(b == 10_000_000_000); // Assert Failure
}
*/
```


Integer overflow. By default an enum is defined as `int`, which 
is limited to 32 bits. `int.max` is 2_147_483_647, which is the 
biggest number representable with an int.


You can declare the enum to be of a bigger type: `enum : long { 
w = 100_000 };`
Or you can use `std.bigint` if you don't know the maximum you 
work with, or the library `std.experimental.checkedint`, which 
allows you to set the behaviour you want in case of overflow.
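
A minimal sketch of the typed-enum fix applied to the small 
version from the question:

```d
void main()
{
    enum : long { w = 100_000 }
    size_t b = w * w;            // multiplication now happens in 64 bit
    assert(b == 10_000_000_000); // passes (on a 64 bit target)
}
```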


Re: Wrong result with enum

2021-11-11 Thread Patrick Schluter via Digitalmars-d-learn

On Thursday, 11 November 2021 at 12:05:19 UTC, Tejas wrote:
On Thursday, 11 November 2021 at 09:11:37 UTC, Salih Dincer 
wrote:
On Thursday, 11 November 2021 at 06:34:16 UTC, Stanislav 
Blinov wrote:
On Thursday, 11 November 2021 at 05:37:05 UTC, Salih Dincer 
wrote:

is this a issue, do you need to case?

```d
enum tLimit = 10_000;  // (1) true result
enum wLimit = 100_000; // (2) wrong result
```


https://dlang.org/spec/enum.html#named_enums

Unless explicitly set, default type is int. 10^10 is 
greater than int.max.

```d
  enum w = 100_000;
  size_t b = w * w;
  // size_t b = 10^5 * 10^5; // ???
  assert(b == 10_000_000_000); // Assert Failure
```
The w!(int) is not greater than the b!(size_t)...


Are you on 32-bit OS? I believe `size_t` is 32 bits on 32 bit 
OS and 64 on a 64-bit OS


That's not the issue with his code. The 32 bit overflow happens 
already during the `w * w` multiplication. The wrong result is 
then assigned to the `size_t`.


`cast(size_t)w * w` or the declaration `enum : size_t { w = 
100_000 };` would change that.





Re: writef, compile-checked format, pointer

2021-08-09 Thread Patrick Schluter via Digitalmars-d-learn

On Monday, 9 August 2021 at 19:38:28 UTC, novice2 wrote:

format!"fmt"() and writef!"fmt"() templates
with compile-time checked format string
not accept %X for pointers,

but format() and writef() accept it

https://run.dlang.io/is/aQ05Ux
```
void main() {
import std.stdio: writefln;
int x;
writefln("%X", );  //ok
writefln!"%s"();  //ok
//writefln!"%X"();  //compile error
}
```

is this intentional?


Yes. %X is for formatting integers. Runtime evaluation of a 
format string does not allow for type checking. When using the 
template, the evaluation can be thorough and the types can be 
checked properly. You have 2 solutions for your problem: either 
a type cast


writefln!"%X"(cast(size_t)&x);

or using the generic format specifier, which will itself deduce 
the format to use depending on the passed type.


writefln!"%s"(&x);




Re: Is returning void functions inside void functions a feature or an artifact?

2021-08-03 Thread Patrick Schluter via Digitalmars-d-learn

On Monday, 2 August 2021 at 14:46:36 UTC, jfondren wrote:

On Monday, 2 August 2021 at 14:31:45 UTC, Rekel wrote:

[...]


I don't know where you can find this in the docs, but what 
doesn't seem trivial about it? The type of the expression 
`print()` is void. That's the type that `doSomething` returns. 
That's the type of the expression that `doSomething` does 
return and the type of the expression following a `return` 
keyword in `doSomething`. Rather than a rule expressly 
permitting this, I would expect to find either nothing (it's 
permitted because it makes sense) or a rule against it (it's 
expressly forbidden because it has to be to not work, because 
it makes sense).


C, C++, Rust, and Zig are all fine with this. Nim doesn't like 
it.


Wow. Just discovered that C accepts it. After 35 years of daily 
use of C, there are still things to discover.


Re: issue with static foreach

2021-07-22 Thread Patrick Schluter via Digitalmars-d-learn

On Thursday, 22 July 2021 at 03:43:44 UTC, someone wrote:


Now, if I uncomment those two innocuous commented lines for the 
if (true == true) block:


```d

labelSwitch: switch (lstrExchangeID) {

static foreach (sstrExchangeID; gstrExchangeIDs) {

   mixin(r"case r"d, `"`, sstrExchangeID, `"`, r"d : "d);
   mixin(r"classTickerCustom"d, sstrExchangeID, r" 
lobjTicker"d, sstrExchangeID, r" = new classTickerCustom"d, 
sstrExchangeID, r"(lstrSymbolID);"d);

   mixin(r"if (true == true) {"d);
   mixin(r"pobjTickersCustom"d, sstrExchangeID, r" ~= 
lobjTicker"d, sstrExchangeID, r";"d);
   mixin(r"pobjTickersCommon ~= cast(classTickerCommon) 
lobjTicker"d, sstrExchangeID, r";"d);

   mixin(r"}"d);
   mixin(r"break labelSwitch;"d);

}

default :

   break;

}
```


What an unreadable mess. Sorry.

I would have done something like this:


```d
mixin(format!
`case r"%1$s"d :
    classTickerCustom%1$s lobjTicker%1$s = new classTickerCustom%1$s(lstrSymbolID);

    if (true == true) {
        pobjTickersCustom%1$s ~= lobjTicker%1$s;
        pobjTickersCommon ~= cast(classTickerCommon) lobjTicker%1$s;
    }
    break labelSwitch;`(sstrExchangeID)
);
```

That's easier to edit imho.



Re: wanting to try a GUI toolkit: needing some advice on which one to choose

2021-06-03 Thread Patrick Schluter via Digitalmars-d-learn

On Tuesday, 1 June 2021 at 20:56:05 UTC, someone wrote:
On Tuesday, 1 June 2021 at 16:20:19 UTC, Ola Fosheim Grøstad 
wrote:



[...]


I wasn't considering/referring to content in the browser, this 
is an entirely different arena.


[...]


Thank you! I can only agree.


Re: Recommendations on avoiding range pipeline type hell

2021-05-16 Thread Patrick Schluter via Digitalmars-d-learn

On Sunday, 16 May 2021 at 09:55:31 UTC, Chris Piker wrote:

On Sunday, 16 May 2021 at 09:17:47 UTC, Jordan Wilson wrote:


Another example:
```d
auto r = [iota(1,10).map!(a => a.to!int),iota(1,10).map!(a => 
a.to!int)];

# compile error
```

Hi Jordan

Nice succinct example.  Thanks for looking at the code :)

So, honest question.  Does it strike you as odd that the exact 
same range definition is considered to be two different types?


Even in C
```
typedef struct {
int a;
} type1;
```
and
```
struct {
int a;
} type2;
```

are two different types. The compiler will give an error if you 
pass one to a function expecting the other.


```
void fun(type1 v)
{
}

type2 x;

fun(x);  // gives error
```
See https://godbolt.org/z/eWenEW6q1
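
The same effect shows up in D with ranges: each lambda literal is 
its own symbol, so two textually identical pipelines instantiate 
two distinct template types (a sketch):

```d
import std.algorithm : map;
import std.range : iota;

void main()
{
    auto a = iota(1, 10).map!(x => x * 2);
    auto b = iota(1, 10).map!(x => x * 2);  // same text, different lambda
    static assert(!is(typeof(a) == typeof(b)));
}
```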


Maybe that's eminently reasonable to those with deep knowledge, 
but it seems crazy to a new D programmer.  It breaks a general 
assumption about programming when copying and pasting a 
definition yields two things that aren't the same type. (except 
in rare cases like SQL where null != null.)






On a side note, I appreciate that `.array` solves the problem, 
but I'm writing pipelines that are supposed to work on 
arbitrarily long data sets (> 1.4 TB is not uncommon).





Re: Shutdown signals

2021-05-11 Thread Patrick Schluter via Digitalmars-d-learn

On Tuesday, 11 May 2021 at 06:44:57 UTC, Tim wrote:

On Monday, 10 May 2021 at 23:55:18 UTC, Adam D. Ruppe wrote:

[...]


I don't know why I didn't find that. I was searching for the 
full name, maybe too specific? Thanks anyways, this is super 
helpful. I wish it was documented better though :(


So why use sigaction and not signal? From what I can tell 
signal is the C way of doing things


Use `sigaction()`; `signal()` has problems. See this 
stackoverflow question [1], which explains the details.

[1]: 
https://stackoverflow.com/questions/231912/what-is-the-difference-between-sigaction-and-signal
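
A minimal sketch of installing a handler with sigaction() through 
druntime's bindings (`onInterrupt` and `installHandler` are 
made-up names):

```d
import core.sys.posix.signal;

extern (C) void onInterrupt(int sig) nothrow @nogc
{
    // Only async-signal-safe work belongs here.
}

void installHandler()
{
    sigaction_t sa;
    sa.sa_handler = &onInterrupt;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGINT, &sa, null);
}
```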


Re: How to delete dynamic array ?

2021-03-18 Thread Patrick Schluter via Digitalmars-d-learn
On Wednesday, 17 March 2021 at 16:20:06 UTC, Steven Schveighoffer 
wrote:


It's important to understand that [] is just a practical syntax 
for a fat pointer.


Thinking of [] just as a fancy pointer helps imho to clarify that 
the nature of the pointed-to memory is independent of the pointer 
itself.
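
A tiny sketch of what the fat-pointer view means in practice:

```d
void main()
{
    int[] a = [1, 2, 3, 4];
    int[] b = a[1 .. 3];        // just a new {ptr, length} pair
    assert(b.ptr == a.ptr + 1); // same underlying memory
    assert(b.length == 2);
}
```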


Re: Endianness - How to test code for portability

2021-03-13 Thread Patrick Schluter via Digitalmars-d-learn

On Friday, 12 March 2021 at 05:53:40 UTC, Preetpal wrote:
In the portability section of the language spec, they talk 
about endianness 
(https://dlang.org/spec/portability.html#endianness)  which 
refers "to the order in which multibyte types are stored." IMO 
if you wanted to actually be sure your code is portable across 
both big endian and little endian systems, you should actually 
run your code on both types of systems and test if there any 
issues.


The problem is that I am not aware of any big-endian systems 
that you can actually test on and if there is any D lang 
compiler support for any of these systems if they exist.


This is not an important issue to me but I was just curious to 
see if anyone actually tests for portability issues related to 
endianness by compiling their D Lang code for a big endian 
architecture and actually running it on that system.


Actual big endian systems? Not many around anymore:
- SPARC: almost dead
- IBM z/System: still around and not going away, but a D 
implementation is not very likely, as it adds the other 
difficulty that it is not ASCII but EBCDIC.

- AVR32: doesn't look very lively.
- Freescale ColdFire (as successor of the 68K): also on a 
descending path.

- OpenRISC: superseded by RISC-V.

Some CPUs can do both but are generally used in little endian 
mode (ARM, POWER), or are also obsolete (Alpha, IA64).


While from an intellectual perspective endianness support is a 
good thing, from a purely pragmatic view it is a solved issue. 
Little endian won, definitively (except on the network in the 
TCP/IP headers).


Re: where is the memory corruption?

2020-12-10 Thread Patrick Schluter via Digitalmars-d-learn

On Wednesday, 9 December 2020 at 21:28:04 UTC, Paul Backus wrote:

On Wednesday, 9 December 2020 at 21:21:58 UTC, ag0aep6g wrote:


D's wchar is not C's wchar_t. D's wchar is 16 bits wide. The 
width of C's wchar_t is implementation-defined. In your case 
it's probably 32 bits.


In D, C's wchar_t is available as `core.stdc.stddef.wchar_t`.

http://dpldocs.info/experimental-docs/core.stdc.stddef.wchar_t.1.html


Don't use wchar_t in C. It has a variable size depending on the 
implementation. On POSIX machines (Linux, BSD etc.) it's 32 bit 
wide UTF-32; on Windows it's 16 bit UTF-16.
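
A compile-time check of the difference (a sketch):

```d
import core.stdc.stddef : wchar_t;

static assert(wchar.sizeof == 2);  // D's wchar: always a UTF-16 code unit
version (Posix)   static assert(wchar_t.sizeof == 4);  // dchar underneath
version (Windows) static assert(wchar_t.sizeof == 2);  // wchar underneath
```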





Re: Return values from auto function

2020-11-07 Thread Patrick Schluter via Digitalmars-d-learn

On Saturday, 7 November 2020 at 15:49:13 UTC, James Blachly wrote:


```
retval = i > 0 ? Success!int(i) : Failure("Sorry");
```

casting each to `Result` compiles, but is verbose:

```
return i > 0 ? cast(Result) Success!int(i) : cast(Result) 
Failure("Sorry");

```

** Could someone more knowledgeable than me explain why 
implicit conversion does not happen with the ternary op, but 
works fine with if/else? Presumably, it is because the op 
returns a single type and implicit conversion is performed 
after computing the expression's return type? If this somehow 
worked, it would make the SumType package much more ergonomic **


It's just that the ternary operator requires the same type in 
both branches. It was already so in C.



return i > 0 ? (retval = Success!int(i)) : (retval = 
Failure("Sorry"));


should work


Re: why `top` report is not consistent with the memory freed by core.stdc.stdlib : free?

2020-11-06 Thread Patrick Schluter via Digitalmars-d-learn

On Friday, 6 November 2020 at 06:17:42 UTC, mw wrote:

Hi,

I'm trying this:

https://wiki.dlang.org/Memory_Management#Explicit_Class_Instance_Allocation

using core.stdc.stdlib : malloc and free to manually manage 
memory, I tested two scenarios:


-- malloc & free
-- malloc only

and I use Linux command `top` to check the memory used by the 
program, there is no difference in this two scenarios.


I also tried to use `new` to allocate the objects, and 
GC.free(). The memory number reported by `top` is much less 
than those reported by using core.stdc.stdlib : malloc and free.



I'm wondering why? shouldn't core.stdc.stdlib : malloc and free 
be more raw (low-level) than new & GC.free()? why `top` shows 
stdlib free() is not quite working?




stdlib free() normally does not give memory back to the system 
on Linux. top only shows the virtual memory granted to the 
process. When you malloc, VIRT goes up and RES might go up as 
well, but they only go down if the release is explicitly 
requested.
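
On glibc that explicit request exists: malloc_trim() (a sketch; 
the declaration is written by hand here since it is a glibc 
extension, not part of the C standard bindings):

```d
// glibc-specific; returns 1 if memory was actually released.
extern (C) int malloc_trim(size_t pad);

void releaseFreedPages()
{
    malloc_trim(0);  // may lower the RES number shown by top
}
```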





Re: Getting Qte5 to work

2020-10-28 Thread Patrick Schluter via Digitalmars-d-learn

On Wednesday, 28 October 2020 at 06:52:35 UTC, evilrat wrote:


Just an advice, Qte5 isn't well maintained, the other 
alternatives such as 'dlangui' also seems abandoned, so 
basically the only maintained UI library here is gtk-d, but 
there was recently a nice tutorial series written about it.


DWT is also still active. The looks are a little outdated, as it 
is swt-3 based, but it works just fine.





Re: Why was new(size_t s) { } deprecated in favor of an external allocator?

2020-10-15 Thread Patrick Schluter via Digitalmars-d-learn

On Wednesday, 14 October 2020 at 20:32:51 UTC, Max Haughton wrote:

On Wednesday, 14 October 2020 at 20:27:10 UTC, Jack wrote:

What was the reasoning behind this decision?


Andrei's std::allocator talk from a few years ago at cppcon 
covers this (amongst other things)


Yes, and what did he say?
You seriously don't expect people to search for a random talk 
from a random event from a random year?


Re: Why is BOM required to use unicode in tokens?

2020-09-18 Thread Patrick Schluter via Digitalmars-d-learn
On Wednesday, 16 September 2020 at 00:22:15 UTC, Steven 
Schveighoffer wrote:

On 9/15/20 8:10 PM, James Blachly wrote:

On 9/15/20 10:59 AM, Steven Schveighoffer wrote:

[...]


Steve: It sounds as if the spec is correct but the glyph 
(codepoint?) range is outdated. If this is the case, it would 
be a worthwhile update. Do you really think it would be 
rejected out of hand?




I don't really know the answer, as I'm not a unicode expert.

Someone should verify that the character you want to use for a 
symbol name is actually considered a letter or not. Using 
phobos to prove this is kind of self-defeating, as I'm pretty 
sure it would be in league with DMD if there is a bug.


I checked, it's not a letter. None of the math symbols are.



But if it's not a letter, then it would take more than just 
updating the range. It would be a change in the philosophy of 
what constitutes an identifier name.







Re: Generating struct .init at run time?

2020-07-02 Thread Patrick Schluter via Digitalmars-d-learn

On Thursday, 2 July 2020 at 07:51:29 UTC, Ali Çehreli wrote:
Normally, struct .init values are known at compile time. 
Unfortunately, they add to binary size:


[...]
memset() is the function you want. The initializer is an element 
generated in the data segment (or in a read-only segment) that 
will be copied to the variable by an internal call to memcpy(). 
The same happens in C, except that the compilers are often clever 
and replace the copy with a memset().
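
A sketch of the memset() shortcut and its main caveat:

```d
import core.stdc.string : memset;

struct S { int a; double b; }  // double.init is NaN: .init is NOT all zero bits

void zeroOut(ref S s) @system
{
    // Equivalent to copying .init only for types whose .init
    // is all zero bits (ints, pointers, bools...), not for S above.
    memset(&s, 0, S.sizeof);
}
```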






Re: "if not" condition check (for data validation)

2020-06-18 Thread Patrick Schluter via Digitalmars-d-learn

On Thursday, 18 June 2020 at 13:58:33 UTC, Dukc wrote:

On Thursday, 18 June 2020 at 13:57:39 UTC, Dukc wrote:

if (not!(abra && cadabra)) ...

if (not(abra && cadabra)) ...


Which is quite a complicated way to write

if (!(abra && cadabra)) ...



String interpolation

2020-05-21 Thread Patrick Schluter via Digitalmars-d-learn

https://forum.dlang.org/post/prlulfqvxrgrdzxot...@forum.dlang.org

On Tuesday, 10 November 2015 at 11:22:56 UTC, wobbles wrote:


int a = 1;
int b = 4;
writefln("The number %s is less than %s", a, b);


writeln("The number ",a, " is less than ",b);


Re: Compilation memory use

2020-05-05 Thread Patrick Schluter via Digitalmars-d-learn

On Monday, 4 May 2020 at 17:00:21 UTC, Anonymouse wrote:
TL;DR: Is there a way to tell what module or other section of a 
codebase is eating memory when compiling?


[...]


maybe with the massif tool of valgrind?


Re: How does one read file line by line / upto a specific delimeter of an MmFile?

2020-03-16 Thread Patrick Schluter via Digitalmars-d-learn

On Monday, 16 March 2020 at 13:09:08 UTC, Adnan wrote:

On Sunday, 15 March 2020 at 00:37:35 UTC, H. S. Teoh wrote:
On Sat, Mar 14, 2020 at 10:37:37PM +, Adnan via 
Digitalmars-d-learn wrote:

[...]


That's because a memory-mapped file appears directly in your 
program's memory address space as if it was an array of bytes 
(ubyte[]).  No interpretation is imposed upon the contents.  
If you want lines out of it, try casting the memory to 
const(char)[] and using std.algorithm.splitter to get a range 
of lines. For example:


auto mmfile = new MmFile("myfile.txt");
auto data = cast(const(char)[]) mmfile[];
auto lines = data.splitter("\n");
foreach (line; lines) {
...
}


T


Would it be wasteful to cast the entire content into a const 
string? Can a memory mapped file be read with a buffer?


a string is the same thing as immutable(char)[]. It would make 
no difference with the example above.


Re: CT regex in AA at compile time

2020-01-07 Thread Patrick Schluter via Digitalmars-d-learn
On Tuesday, 7 January 2020 at 15:40:58 UTC, Taylor Hillegeist 
wrote:

I'm trying to trick the following code snippet into compilation.

enum TokenType{
//Terminal
Plus,
Minus,
LPer,
RPer,
Number,
}

static auto Regexes =[
  TokenType.Plus:   ctRegex!(`^ *\+`),
  TokenType.Minus:  ctRegex!(`^ *\-`),
  TokenType.LPer:   ctRegex!(`^ *\(`),
  TokenType.RPer:   ctRegex!(`^ *\)`),
  TokenType.Number: ctRegex!(`^ *[0-9]+(.[0-9]+)?`)
];

but I can't get it to work. It says it's an Error: non-constant 
expression.


I imagine this has to do with the ctRegex template or 
something. Maybe there is a better way? Does anyone know?


In that specific case: why don't you use an array indexed on 
TokenType? TokenType values are consecutive integrals, so 
indexing is the fastest possible access method.
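
One way to sketch the indexed idea while keeping ctRegex: a final 
switch over the token type instead of a runtime-built table 
(untested, `matchToken` is a made-up name):

```d
import std.regex;

enum TokenType { Plus, Minus, LPer, RPer, Number }

auto matchToken(TokenType t, const(char)[] input)
{
    final switch (t)
    {
        case TokenType.Plus:   return input.matchFirst(ctRegex!`^ *\+`);
        case TokenType.Minus:  return input.matchFirst(ctRegex!`^ *\-`);
        case TokenType.LPer:   return input.matchFirst(ctRegex!`^ *\(`);
        case TokenType.RPer:   return input.matchFirst(ctRegex!`^ *\)`);
        case TokenType.Number: return input.matchFirst(ctRegex!`^ *[0-9]+(\.[0-9]+)?`);
    }
}
```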


Re: What kind of Editor, IDE you are using and which one do you like for D language?

2019-12-30 Thread Patrick Schluter via Digitalmars-d-learn

On Monday, 30 December 2019 at 14:59:22 UTC, bachmeier wrote:

On Monday, 30 December 2019 at 06:43:03 UTC, H. S. Teoh wrote:


[...]


Another way in which the IDE is "heavy" is the amount of 
overhead for beginning/occasional users. I like that I can get 
someone started using D like this:


1. Open text editor
2. Type simple program
3. Compile by typing a few characters into a terminal/command 
prompt.


An IDE adds a crapload to the learning curve. It's terrible, 
because they need to memorize a bunch of steps when they use a 
GUI (click here -> type this thing in this box -> click here -> 
...)


Back when I was teaching intro econ courses, which are taken by 
nearly all students here, I'd sometimes be talking with 
students taking Java or C++ courses. One of the things that 
really sucked (beyond using Java for an intro programming 
class) was that they'd have to learn the IDE first. Not only 
were they hit with this as the simplest possible program:


public class HelloWorld {
public static void main(String[] args) {
System.out.println("Hello, World");
}
}

but before they even got there, the instructor went through an 
entire lecture teaching them about the IDE. That's an effective 
way to make students think programming is a mind-numbingly 
stupid task on par with reading the phone book.


Contrast that with students opening a text editor, typing 
`print "Hello World"` and then running the program.


IDE support should obviously be made available. I think it 
would be a mistake, however, to move away from the simplicity 
of being able to open a text editor, type in a few lines, and 
then compile and run in a terminal. It's not just beginners. 
This is quite handy for those who will occasionally work with D 
code. For someone in my position (academic research), beginners 
and occasional programmers represents most of the user base.


Good point. It also trains people to not be able to work without 
an IDE. I see it at work with some of the Java devs, who aren't 
even able to invoke javac on a command line and set the javapath 
correctly. Why? Because the IDE shielded them from these easy 
things. It has also the corollary that they're sometimes not 
capable of implementing simple protocols or file processing 
without resorting to external libraries. A little bit like people 
needing even and odd libraries in Javascript.


Re: What kind of Editor, IDE you are using and which one do you like for D language?

2019-12-30 Thread Patrick Schluter via Digitalmars-d-learn

On Sunday, 29 December 2019 at 14:41:46 UTC, Russel Winder wrote:
On Sat, 2019-12-28 at 22:01 +, p.shkadzko via 
Digitalmars-d-learn

wrote:
[…]
p.s. I found it quite satisfying that D does not really need 
an IDE, you will be fine even with nano.




The fundamental issue with these batteries-included fancy IDEs 
(especially in Java) is that they tend to become dependencies of 
the projects themselves.


How many times have I seen, in my professional world, projects 
that required specific versions of Eclipse with specific versions 
of extensions and libraries?
At my work we currently have exactly that problem. One developer 
wrote one of the desktop apps and has now left the company. My 
colleagues in that department are now struggling to maintain the 
app, as it used some specific GUI libs linked to some Eclipse 
version and they are nowhere to be found. You may object that 
it's a problem of project management, and I would agree. It was 
the management's error to let the developer choose the IDE 
solution in the first place. A more classical/portable approach 
would have been preferable.


Furthermore, it is extremely annoying that these IDEs change over 
time: all the fancy stuff gets stale and is replaced with other 
stuff that gets stale in turn.
Visual Studio is one of the worst offenders in that category. 
Every 5 years it changes so much that everything learnt before 
can be thrown away.
IDEs work well for the scenarios that the developers of the IDE 
thought of. Anything a little bit different requires changes that 
are either impossible to model or require intimate knowledge of 
the functioning of the IDE. Visual Studio comes to mind again as 
an example where that is horribly painful (and I do not even 
mention the difficulty of installing such behemoth programs on 
our corporate laptops, which sit behind stupid proxies and follow 
annoying corporate policy rules).






Re: Template specialized functions creating runtime instructions?

2019-08-21 Thread Patrick Schluter via Digitalmars-d-learn

On Wednesday, 21 August 2019 at 00:11:23 UTC, ads wrote:

On Wednesday, 21 August 2019 at 00:04:37 UTC, H. S. Teoh wrote:
On Tue, Aug 20, 2019 at 11:48:04PM +, ads via 
Digitalmars-d-learn wrote: [...]
2) Deducing the string as you describe would require CTFE 
(compile-time function evaluation), which usually isn't done 
unless the result is *required* at compile-time.  The typical 
way to force this to happen is to store the result into an 
enum:


enum myStr = fizzbuzz!...(...);
writeln(myStr);

Since enums have to be known at compile-time, this forces CTFE 
evaluation of fizzbuzz, which is probably what you're looking 
for here.


T


Thank you for clearing those up. However even if I force CTFE 
(line 35), it doesn't seem to help much.


https://godbolt.org/z/MytoLF


It does.

on line 4113 you have that string

.L.str:
.asciz  
"Buzz\n49\nFizz\n47\n46\nFizzBuzz\n44\n43\nFizz\n41\nBuzz\nFizz\n38\n37\nFizz\nBuzz\n34\nFizz\n32\n31\nFizzBuzz\n29\n28\nFizz\n26\nBuzz\nFizz\n23\n22\nFizz\nBuzz\n19\nFizz\n17\n16\nFizzBuzz\n14\n13\nFizz\n11\nBuzz\nFizz\n8\n7\nFizz\nBuzz\n4\nFizz\n2\n1\n"


and all main() does is call writeln with that string

_Dmain:
    push rax
    lea rsi, [rip + .L.str]
    mov edi, 203
    call @safe void std.stdio.writeln!(immutable(char)[]).writeln(immutable(char)[])@PLT
    xor eax, eax
    pop rcx
    ret


You haven't given instructions to the linker to strip unused 
code, so the functions generated by the templates are still there.


Re: How should I sort a doubly linked list the D way?

2019-08-14 Thread Patrick Schluter via Digitalmars-d-learn

On Tuesday, 13 August 2019 at 18:28:35 UTC, Ali Çehreli wrote:

On 08/13/2019 10:33 AM, Mirjam Akkersdijk wrote:
> On Tuesday, 13 August 2019 at 14:04:45 UTC, Sebastiaan Koppe
wrote:

>> Convert the nodes into a D array, sort the array with
>> nodes.sort!"a.x < b.x" and then iterate the array and repair
>> the next/prev pointers.

If possible, I would go further and ditch the linked list 
altogether: just append the nodes to an array and then sort the 
array. It has been shown in research, conference presentations, 
and in personal code to be the fastest option in most (or all) 
cases.


> doesn't the nature of the dynamic array slow it down a bit?

Default bounds checking is going to cost a tiny bit, which you 
can turn off after development with a compiler flag. (I still 
wouldn't.)


The only other option that would be faster is an array that's 
sitting on the stack, created with alloca. But it's only for 
cases where the thread will not run out of stack space and the 
result of the array is not going to be used.


> can't I define an array of fixed size, which is dependent on
the input
> of the function?

arr.length = number_of_elements;

All elements will be initialized to the element's default 
value, which happens to be null for pointers. (If we are back 
to linked list Node pointers.)


However, I wouldn't bother with setting length either as the 
cost of automatic array resizing is amortized, meaning that it 
won't hurt the O(1) algorithmic complexity in the general case. 
In the GC case that D uses, it will be even better: because if 
the GC knowns that the neighboring memory block is free, it 
will just add that to the dynamic array's capacity without 
moving elements to the new location.


Summary: Ditch the linked list and put the elements into an 
array. :)




There are mainly four reasons why arrays are nowadays faster 
than doubly linked lists:
- pointer chasing can hardly be parallelized and defeats 
prefetching. Each pointer load may cost the full latency to 
memory (hundreds of cycles). In a multiprocessor machine it may 
also trigger a lot of coherency traffic.
- on 64 bit systems, 2 pointers cost 16 bytes. If the payload is 
small, there is more memory used for the pointers than for the 
data.
- when looping over an array, the OoO (out-of-order) machinery 
can parallelize execution beyond loop limits.
- reduced allocation, i.e. allocation is done in bulk => faster 
GC for D.


It is only when there are a lot of external references to the 
payload in the list that using an array may become too unwieldy, 
i.e. if moving an element in memory requires the update of other 
pointers outside of the list.
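
A sketch of the advice from the thread: collect the payloads into 
an array and sort that (`Node` is assumed minimal here, with an 
int payload):

```d
import std.algorithm : sort;

struct Node { int x; Node* prev, next; }

int[] toSortedArray(Node* head)
{
    int[] arr;
    for (auto n = head; n !is null; n = n.next)
        arr ~= n.x;   // amortized appends into contiguous memory
    sort(arr);        // no pointer chasing during the sort
    return arr;
}
```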




Re: Question about ubyte x overflow, any safe way?

2019-08-05 Thread Patrick Schluter via Digitalmars-d-learn

On Monday, 5 August 2019 at 18:21:36 UTC, matheus wrote:

On Monday, 5 August 2019 at 01:41:06 UTC, Ali Çehreli wrote:

...
Two examples with foreach and ranges. The 'ubyte.max + 1' 
expression is int. The compiler casts to ubyte (because we 
typed ubyte) in the foreach and we cast to ubyte in the range:

...


Maybe it was a bad example on my part (using for), and indeed 
using foreach would solve that specific issue, but what I'm 
really looking for is whether there is a flag or a way to check 
for overflow when assigning to some variable.


ubyte u = 260;  // Here should be given some warning or throw 
exception.


It's ubyte, but it could be any other data type.



Yes, no question. It's checkedint that you should use. It was 
written exactly for that purpose.
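
A minimal sketch with std.experimental.checkedint's Throw hook 
(the exact exception type is left to the library):

```d
void main()
{
    import std.exception : assertThrown;
    import std.experimental.checkedint : checked, Throw;

    auto u = checked!Throw(ubyte(250));
    assertThrown(u += 10);   // 260 does not fit in a ubyte
}
```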






Re: Is there a way to slice non-array type in @safe?

2019-07-12 Thread Patrick Schluter via Digitalmars-d-learn
On Thursday, 11 July 2019 at 19:35:50 UTC, Stefanos Baziotis 
wrote:

On Thursday, 11 July 2019 at 18:46:57 UTC, Paul Backus wrote:


Casting from one type of pointer to another and slicing a 
pointer are both @system, by design.


Yes, I'm aware, there are no pointers in the code. The pointer 
was used
here because it was the only way to solve the problem (but not 
in @safe).


What's the actual problem you're trying to solve? There may be 
a different way to do it that's @safe.


I want to make an array of bytes that has the bytes of the 
value passed.
For example, if T = int, then I want an array of 4 bytes that 
has the 4
individual bytes of `s1` let's say. For long, an array of 8 
bytes etc.
Ideally, that would work with `ref` (i.e. the bytes of where 
the ref points to).


Imho this cannot be safe on a first-principles basis. You gain 
access to the machine representation of the variable, which means 
you bypass the "control" the compiler has over its data. The 
endianness issue alone is enough to cause different behaviour of 
your program on different implementations. While in practice big 
endian is nearly an extinct species, it is still enough to show 
why that operation is inherently @system and should not be 
considered @safe.
Of course, a @trusted function can be written to take care of 
that, but that's in fact exactly the case as it should be.
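
A sketch of such a @trusted escape hatch (`bytesOf` is a made-up 
helper; the caller must still ensure the slice does not outlive 
the variable, and the byte order remains platform-dependent):

```d
@trusted ubyte[] bytesOf(T)(ref T value)
{
    // Exposes the raw machine representation; inherently @system
    // in nature, hence the explicit @trusted promise.
    return (cast(ubyte*) &value)[0 .. T.sizeof];
}

@safe unittest
{
    int s1 = 0x01020304;
    auto b = bytesOf(s1);
    assert(b.length == 4);  // which byte comes first depends on endianness
}
```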


Re: Why are immutable array literals heap allocated?

2019-07-07 Thread Patrick Schluter via Digitalmars-d-learn

On Saturday, 6 July 2019 at 09:56:57 UTC, ag0aep6g wrote:

On 06.07.19 01:12, Patrick Schluter wrote:

On Friday, 5 July 2019 at 23:08:04 UTC, Patrick Schluter wrote:
On Thursday, 4 July 2019 at 10:56:50 UTC, Nick Treleaven 
wrote:

immutable(int[]) f() @nogc {
    return [1,2];
}

[...]


and it cannot optimize it away because it doesn't know what 
the caller wants to do with it. It might in another module 
invoke it and modify it; the compiler cannot tell. auto a=f(); 
a[0]++;


f returns immutable. typeof(a) is immutable(int[]). You can't 
do a[0]++.


You're right, I shouldn't post at 1 am.


Re: Why are immutable array literals heap allocated?

2019-07-05 Thread Patrick Schluter via Digitalmars-d-learn

On Friday, 5 July 2019 at 23:08:04 UTC, Patrick Schluter wrote:

On Thursday, 4 July 2019 at 10:56:50 UTC, Nick Treleaven wrote:

immutable(int[]) f() @nogc {
return [1,2];
}

onlineapp.d(2): Error: array literal in `@nogc` function 
`onlineapp.f` may cause a GC allocation


This makes dynamic array literals unusable with @nogc, and 
adds to GC pressure for no reason. What code would break if 
dmd used only static data for [1,2]?


int[] in D is not an array but a fat pointer. When one realizes 
that, then it becomes quite obvious why [1,2] was allocated. 
There is somewhere in the binary a static array [1,2], but as it 
is assigned to a pointer to mutable data, the compiler has no 
choice but to allocate a mutable copy of that immutable array.


and it cannot optimize it away because it doesn't know what the 
caller wants to do with it. It might in another module invoke it 
and modify it; the compiler cannot tell. auto a=f(); a[0]++;


Re: Why are immutable array literals heap allocated?

2019-07-05 Thread Patrick Schluter via Digitalmars-d-learn

On Thursday, 4 July 2019 at 10:56:50 UTC, Nick Treleaven wrote:

immutable(int[]) f() @nogc {
return [1,2];
}

onlineapp.d(2): Error: array literal in `@nogc` function 
`onlineapp.f` may cause a GC allocation


This makes dynamic array literals unusable with @nogc, and adds 
to GC pressure for no reason. What code would break if dmd used 
only static data for [1,2]?


int[] in D is not an array but a fat pointer. When one realizes 
that, then it becomes quite obvious why [1,2] was allocated. There 
is somewhere in the binary a static array [1,2], but as it is 
assigned to a pointer to mutable data, the compiler has no choice 
but to allocate a mutable copy of that immutable array.


Re: [OT] Re: 1 - 17 ms, 553 ╬╝s, and 1 hnsec

2019-05-27 Thread Patrick Schluter via Digitalmars-d-learn

On Tuesday, 21 May 2019 at 02:12:10 UTC, Les De Ridder wrote:

On Sunday, 19 May 2019 at 12:24:28 UTC, Patrick Schluter wrote:

On Saturday, 18 May 2019 at 21:05:13 UTC, Les De Ridder wrote:
On Saturday, 18 May 2019 at 20:34:33 UTC, Patrick Schluter 
wrote:
* hurrah for French keyboard which has a rarely used µ key, 
but none for Ç a frequent character of the language.





That's the lowercase ç. The uppercase Ç is not directly 
composable,


No, note that I said <Caps Lock> and not <Shift>. Using 
<Caps Lock><ç> it outputs a 'Ç' for me (at least on X11 with the 
French layout).


Does not work on Windows. <Caps Lock><ç> gives 9. I tested 
also on my Linux Mint box and it output lowercase ç with 
<Caps Lock>.






There are 2 other characters that are not available on the 
French keyboard: œ and Œ. Quite annoying if you sell beef 
(bœuf) and eggs (œufs) in the towns of Œutrange or Œting.


It seems those are indeed not on the French layout at all. 
Might I
suggest using the Belgian layout? It is AZERTY too and has both 
'œ'

and 'Œ'.


No, it hasn't.
I indeed prefer the Belgian keyboard. It has more composable 
dead-key characters: accents, tildes. Brackets [{]} and other 
programming characters < > | etc. are better placed than on the 
French keyboard.
Btw æ and Æ are missing also, but there it's not very important, 
as there are really only very few words in French that use them 
(ex-æquo, curriculum vitæ, et cætera).


[OT] Re: 1 - 17 ms, 553 ╬╝s, and 1 hnsec

2019-05-19 Thread Patrick Schluter via Digitalmars-d-learn

On Saturday, 18 May 2019 at 21:05:13 UTC, Les De Ridder wrote:
On Saturday, 18 May 2019 at 20:34:33 UTC, Patrick Schluter 
wrote:
* hurrah for French keyboard which has a rarely used µ key, 
but none for Ç a frequent character of the language.





That's the lowercase ç. The uppercase Ç is not directly 
composable, which is annoying, or to say it in French to 
illustrate: "Ça fait chier". I use Alt+1+2+8 on Windows, but most 
people do not know these ancient OEM-437 based character codes 
going back to the original IBM-PC. The newer ANSI based 
Alt+0+1+9+9 is one keypress longer and I would actually have to 
learn the code.


There are 2 other characters that are not available on the French 
keyboard: œ and Œ. Quite annoying if you sell beef (bœuf) and 
eggs (œufs) in the towns of Œutrange or Œting.


Re: 1 - 17 ms, 553 ╬╝s, and 1 hnsec

2019-05-18 Thread Patrick Schluter via Digitalmars-d-learn

On Thursday, 16 May 2019 at 15:19:03 UTC, Alex wrote:

1 - 17 ms, 553 ╬╝s, and 1 hnsec


That's µs* for micro-seconds.

* hurrah for French keyboard which has a rarely used µ key, but 
none for Ç a frequent character of the language.




WTH!! is there any way to just get a normal u rather than some 
fancy useless asci hieroglyphic? Why don't we have a fancy M? 
and an h?


What's an hnsec anyways?





Re: Compile time mapping

2019-05-12 Thread Patrick Schluter via Digitalmars-d-learn

On Saturday, 11 May 2019 at 15:48:44 UTC, Bogdan wrote:
What would be the most straight-forward way of mapping the 
members of an enum to the members of another enum (one-to-one 
mapping) at compile time?


An example of an Initial enum that creates a derived enum using 
the same element names, but applying a transformation via a 
function foo(), plus adding some other enum elements in the 
Derived one that are not present in the Initial.

It's a little bit clumsy but works very well.
I use this at module level. This makes the Derived enum available 
at compile time so that it can be used to declare variables or 
functions at compile time.




mixin({
  string code = "enum Derived : ulong { "~
                "init = 0,";  /* We set the dummy init value to 0 */

  static foreach(i; __traits(allMembers, Initial)) {
    code ~= i~" = foo(Initial."~i~"),";
  }
  code ~= "
    ALL   = Whatever,
    THING = 42,";
  return code ~ "}";
}());
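
For completeness, a sketch of the pieces the snippet assumes 
(`Initial`, `foo` and `Whatever` are placeholders):

```d
enum Initial : ulong { A = 1, B = 2, C = 3 }
enum ulong Whatever = 0xFFFF_FFFF;

// must be usable in CTFE, since it runs while building the mixin string
ulong foo(Initial i) { return cast(ulong) i * 10; }
```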




Re: DMD different compiler behaviour on Linux and Windows

2019-04-25 Thread Patrick Schluter via Digitalmars-d-learn

On Thursday, 25 April 2019 at 20:18:28 UTC, Zans wrote:

import std.stdio;

void main()
{
char[] mychars;
mychars ~= 'a';
long index = 0L;
writeln(mychars[index]);
}

Why would the code above compile perfectly on Linux (Ubuntu 
16.04), however it would produce the following error on Windows 
10:


source\app.d(8,21): Error: cannot implicitly convert expression 
index of type long to uint


On both operating systems DMD version is 2.085.0.


The issue here is not Windows vs Linux but 32 bit vs 64 bit.
On 32 bit architectures size_t is defined as uint; long being 64 
bits wide, the conversion from long to uint is a truncating cast, 
which is not allowed implicitly in D.
It is unfortunate that the D compiler on Windows is still 
delivered by default as a 32 bit binary generating 32 bit code. I 
think the next release will start to deliver the compiler as a 
64 bit binary generating 64 bit code.




Re: How to debug long-lived D program memory usage?

2019-04-21 Thread Patrick Schluter via Digitalmars-d-learn

On Thursday, 18 April 2019 at 12:00:10 UTC, ikod wrote:
On Wednesday, 17 April 2019 at 16:27:02 UTC, Adam D. Ruppe 
wrote:
D programs are a vital part of my home computer 
infrastructure. I run some 60 D processes at almost any 
time and have recently been running out of memory.


I usually run program under valgrind in this case. Though it 
will not help you to debug GC problems, but will cut off memory 
leaked malloc-s.


Even with valgrind --tool=massif?


Re: Any easy way to extract files to memory buffer?

2019-03-19 Thread Patrick Schluter via Digitalmars-d-learn

On Monday, 18 March 2019 at 23:40:02 UTC, Michelle Long wrote:

On Monday, 18 March 2019 at 23:01:27 UTC, H. S. Teoh wrote:
On Mon, Mar 18, 2019 at 10:38:17PM +, Michelle Long via 
Digitalmars-d-learn wrote:
On Monday, 18 March 2019 at 21:14:05 UTC, Vladimir Panteleev 
wrote:
> On Monday, 18 March 2019 at 21:09:55 UTC, Michelle Long 
> wrote:
> > Trying to speed up extracting some files that I first 
> > have to extract using the command line to files then read 
> > those in...
> > 
> > Not sure what is taking so long. I imagine windows caches 
> > the extraction so maybe it is pointless?

[...]

Why not just use std.mmfile to memory-map the file into memory 
directly? Let the OS take care of actually paging in the file 
data.



T


The files are on disk and there is an external program that 
read them and converts them and then writes the converted files 
to disk then my program reads. Ideally the conversion program 
would take memory instead of disk files but it doesn't.


The file that was written by the first program will be in the 
file cache. The mmap() syscall (and its Windows equivalent) at 
its core only gives access to the OS file cache. This means that 
std.mmfile is the way to go. There will be no reloading from disk 
if the file sizes are within reason.


Re: Should D file end with newline?

2019-02-15 Thread Patrick Schluter via Digitalmars-d-learn

On Wednesday, 13 February 2019 at 05:13:12 UTC, sarn wrote:
On Tuesday, 12 February 2019 at 20:03:09 UTC, Jonathan M Davis 
wrote:

So, I'd say that it's safe to say that dmd
The whole thing just seems like a weird requirement that 
really shouldn't be there,


Like I said in the first reply, FWIW, it's a POSIX requirement.

Turns out most tools don't care (and dmd is apparently one of 
them).  If you want an easy counterexample, try the wc command 
(it miscounts lines for non-compliant files).  I've never seen 
that break an actual build system, which is why I said you 
could mostly get away with it.  On the other hand, being 
POSIX-compliant always works.


it matters even less if text editors are automatically 
appending newlines to files if they aren't there whether they 
show them or not, since if that's the case, you'd have to 
really work at it to have files not ending with newlines 
anyway.


There are definitely broken text editors out there that won't 
add the newline (can't think of names).  Like Jacob Carlborg 
said, Github flags the files they generate.


hexdump shows a newline followed by a null character followed 
by a newline after the carriage return.


hexdump is printing little-endian 16b by default, so I think 
that's just two newlines followed by a padding byte from 
hexdump.
 Try using the -c or -b flag and you probably won't see any 
null byte.


Curiously, if I create a .cpp or .c file with vim and have it 
end with a curly brace, vim _does_ append a newline followed 
by a null character followed by a newline at the end of the 
file. So, I guess that vim looks at the extension and realizes 
that C/C++ has such a requirement and takes care of it for 
you, but it does not think that .d files need them and adds 
nothing extra for them. It doesn't add anything for a .txt 
file when I tried it either.


Are you sure?  vim is supposed to add the newline for all text 
files because that's POSIX.  It does on my (GNU/Linux) machine.


A lot of fgets()-based tools on Unix systems fail to read the 
last line if it doesn't end with a line feed character.
AFAICR the glibc implementation does not have that problem, but 
a lot of other standard C libraries do.
When we were still on Solaris we had to be very careful with 
that, as strange things could happen when using sed, awk, wc and 
a lot of other standard Unix commands.
Now that we have switched to Linux we don't have the issue 
anymore.


Re: Compiling to 68K processor (Maybe GDC?)

2019-01-20 Thread Patrick Schluter via Digitalmars-d-learn
On Sunday, 20 January 2019 at 09:27:33 UTC, Jonathan M Davis 
wrote:
On Saturday, January 19, 2019 10:45:41 AM MST Patrick Schluter 
via Digitalmars-d-learn wrote:

On Saturday, 19 January 2019 at 12:54:28 UTC, rikki cattermole

wrote:
> [...]

At least a 68030 (or a 68020+68851) would be necessary for proper 
segfault handling (MMU) and an OS that uses it. AFAICT NULL 
pointer dereferencing must fault for D to be "usable"; at least, 
all code is written with that assumption.


For @safe to work properly, dereferencing null must be @safe, 
which means more or less means that either it results in a 
segfault, or the compiler has to add additional checks to 
ensure that null isn't dereferenced. The situation does get a 
bit more complicated in the details (e.g. calling a non-virtual 
member function on a null pointer or reference wouldn't 
segfault if the object's members are never actually accessed, 
and that's fine, because it doesn't violate @safe), but in 
general, either a segfault must occur, or the compiler has to 
add extra checks so that invalid memory is not accessed. At 
this point, AFAIK, all of the D compilers assume that 
dereferencing null will segfault, and they don't ever add 
additional checks. If an architecture does not segfault when 
dereferencing null, then it will need special handling by the 
compiler, and I don't think that ever happens right now. So, if 
D were compiled on such an architecture, @safe wouldn't provide 
the full guarantees that it's supposed to.




Ok, thanks for the explanation. That said, my statement that a 
PMMU is required for NULL pointer segfaults is wrong. Even the 
68000 can segfault on a NULL dereference, in user mode at least 
(the famous bus error 2 bombs on the Atari ST or guru 
meditations on the Amiga). In privileged mode, though, it's not 
the case, as there is memory at address 0 (the reset vector) 
that an OS might need to access.




Re: Compiling to 68K processor (Maybe GDC?)

2019-01-19 Thread Patrick Schluter via Digitalmars-d-learn
On Saturday, 19 January 2019 at 12:54:28 UTC, rikki cattermole 
wrote:

On 20/01/2019 1:38 AM, Edgar Vivar wrote:

Hi,

I have a project aiming to old 68K processor. While I don't 
think DMD would be able for this on the other hand I think GDC 
can, am I right?


If yes would be any restriction of features to be used? Or the 
compiler would be smart enough to handle this properly?


Edgar V.


Potentially.

D is designed to only work on 32bit+ architectures. The 68k 
series did have 32bit versions of them.


After a quick check it does look like LDC is out as LLVM has 
not yet got support for M68k target. Which is unfortunate 
because with the -betterC flag it could have pretty much out of 
the box worked. Even if you don't have most of D at your 
disposal e.g. classes and GC (but hey old cpu! can't expect 
that).


I have no idea about GDC, but the -betterC flag is pretty 
recent so its support may not be what you would consider first 
class there yet.


At least a 68030 (or a 68020+68851) would be necessary for proper 
segfault handling (MMU) and an OS that uses it. AFAICT NULL 
pointer dereferencing must fault for D to be "usable"; at least 
all code is written with that assumption.


Re: Bitwise rotate of integral

2019-01-08 Thread Patrick Schluter via Digitalmars-d-learn

On Tuesday, 8 January 2019 at 12:35:16 UTC, H. S. Teoh wrote:
On Tue, Jan 08, 2019 at 09:15:09AM +, Patrick Schluter via 
Digitalmars-d-learn wrote:

On Monday, 7 January 2019 at 23:20:57 UTC, H. S. Teoh wrote:

[...]

> [...]
Are you sure it's dmd looking for the pattern? Playing with 
the godbolt link shows that dmd doesn't generate the rol code 
(gdc 4.8.2 doesn't either).


I vaguely remember a bug about this. There is definitely 
explicit checking for this in dmd; I don't remember if it was a 
bug in the pattern matching code itself, or some other problem, 
that made it fail. You may need to specify -O for the code to 
actually be active. Walter could point you to the actual 
function that does this optimization.



I did use the -O flag. The code generated did not use rol.


Re: signed nibble

2019-01-08 Thread Patrick Schluter via Digitalmars-d-learn
On Tuesday, 8 January 2019 at 10:32:25 UTC, Ola Fosheim Grøstad 
wrote:
On Tuesday, 8 January 2019 at 09:30:14 UTC, Patrick Schluter 
wrote:

[...]


Heh, I remember they had a friday-night trivia contest at the 
mid-90s students pub (for natural sciences) where one of the 
questions was the opcode for 6502 LDA (or was it NOP?), and I 
believe I got it right. The opcode for NOP is burned into my 
memory as $EA was used for erasing code during debugging in a 
monitor. And it was also the letters for the big game company 
Electronic Arts...


The cycle counts for 6502 are pretty easy though as they tend 
to be related to the addressing mode and most of them are in 
the range 1-5... No instruction for multiplication or 
division... Oh the fun...


2-7 cycles ;-)


Re: signed nibble

2019-01-08 Thread Patrick Schluter via Digitalmars-d-learn

On Monday, 7 January 2019 at 21:46:21 UTC, H. S. Teoh wrote:
On Mon, Jan 07, 2019 at 08:41:32PM +, Patrick Schluter via 
Digitalmars-d-learn wrote:

On Monday, 7 January 2019 at 20:28:21 UTC, H. S. Teoh wrote:
> On Mon, Jan 07, 2019 at 08:06:17PM +0000, Patrick Schluter 
> via Digitalmars-d-learn wrote:

[...]
> > Up to 32 bit processors, shifting was more expensive than 
> > branching.
> 
> Really?  Haha, never knew that, even though I date all the 
> way back to writing assembly on 8-bit processors. :-D
> 
Most of my career was programming for 80186. Shifting by one 
was 2 cycles in register and 15 in memory. Shifting by 4, 9 
cycles for regs/21 for mem. And 80186 was a fast shifter 
compared to 8088/86 or 68000 (8+2n cycles).


I used to hack 6502 assembly code.


Yeah, that's also what I started with, on the Apple II in the 
early 80s. I was quite surprised that my 6502 knowledge came in 
very handy when we worked on dial-in modems in the late 90s, as 
the Rockwell modems all used 6502-derived micro-controllers.


During the PC revolution I wrote an entire application in 8088 
assembly.  Used to know many of the opcodes and cycle counts by 
heart like you do, but it's all but a faint memory now.


I had to look up the exact cycle counts ;-). I remember the 
relative costs, more or less, but not the details anymore.




Re: Bitwise rotate of integral

2019-01-08 Thread Patrick Schluter via Digitalmars-d-learn

On Monday, 7 January 2019 at 23:20:57 UTC, H. S. Teoh wrote:
On Mon, Jan 07, 2019 at 11:13:37PM +, Guillaume Piolat via 
Digitalmars-d-learn wrote:

On Monday, 7 January 2019 at 14:39:07 UTC, Per Nordlöw wrote:
> What's the preferred way of doing bitwise rotate of an 
> integral value in D?
> 
> Are there intrinsics for bitwise rotation available in LDC?


Turns out you don't need any:

https://d.godbolt.org/z/C_Sk_-

Generates ROL instruction.


There's a certain pattern that dmd looks for, that it 
transforms into a ROL instruction. Similarly for ROR.  Deviate 
too far from this pattern, though, and it might not recognize 
it as it should.  To be sure, always check the disassembly.


Are you sure it's dmd looking for the pattern? Playing with the 
godbolt link shows that dmd doesn't generate the rol code (gdc 
4.8.2 doesn't either).




Re: signed nibble

2019-01-07 Thread Patrick Schluter via Digitalmars-d-learn

On Monday, 7 January 2019 at 20:28:21 UTC, H. S. Teoh wrote:
On Mon, Jan 07, 2019 at 08:06:17PM +, Patrick Schluter via 
Digitalmars-d-learn wrote:

On Monday, 7 January 2019 at 18:56:17 UTC, H. S. Teoh wrote:
> On Mon, Jan 07, 2019 at 06:42:13PM +0000, Patrick Schluter 
> via Digitalmars-d-learn wrote:

[...]

> > byte b = nibble | ((nibble & 0x40)?0xF0:0);
> 
> This is equivalent to doing a bit comparison (implied by the 
> ? operator).  You can do it without a branch:
> 
> 	cast(byte)(nibble << 4) >> 4
> 
> will use the natural sign extension of a (signed) byte to 
> "stretch" the upper bit.  It just takes 2-3 CPU instructions.
> 

Yeah, my bit-fiddle-fu goes back to pre-barrel-shifter days. 
Up to 32 bit processors, shifting was more expensive than 
branching.


Really?  Haha, never knew that, even though I date all the way 
back to writing assembly on 8-bit processors. :-D


Most of my career was programming for 80186. Shifting by one was 
2 cycles in register and 15 in memory. Shifting by 4, 9 cycles 
for regs/21 for mem. And 80186 was a fast shifter compared to 
8088/86 or 68000 (8+2n cycles).




Re: signed nibble

2019-01-07 Thread Patrick Schluter via Digitalmars-d-learn

On Monday, 7 January 2019 at 18:56:17 UTC, H. S. Teoh wrote:
On Mon, Jan 07, 2019 at 06:42:13PM +, Patrick Schluter via 
Digitalmars-d-learn wrote:

On Monday, 7 January 2019 at 17:23:19 UTC, Michelle Long wrote:
> Is there any direct way to convert a signed nibble in to a 
> signed byte with the same absolute value? Obviously I can do 
> some bit comparisons but just curious if there is a very 
> quick way.


byte b = nibble | ((nibble & 0x40)?0xF0:0);


This is equivalent to doing a bit comparison (implied by the ? 
operator).  You can do it without a branch:


cast(byte)(nibble << 4) >> 4

will use the natural sign extension of a (signed) byte to 
"stretch" the upper bit.  It just takes 2-3 CPU instructions.




Yeah, my bit-fiddle-fu goes back to pre-barrel-shifter days. Up 
to 32 bit processors, shifting was more expensive than branching.




Re: signed nibble

2019-01-07 Thread Patrick Schluter via Digitalmars-d-learn

On Monday, 7 January 2019 at 18:47:04 UTC, Adam D. Ruppe wrote:
On Monday, 7 January 2019 at 18:42:13 UTC, Patrick Schluter 
wrote:

byte b = nibble | ((nibble & 0x40)?0xF0:0);


don't you mean & 0x80 ?


He asked for a signed nybble, so mine is wrong and yours too :-)

It's obviously 0x08 for the highest bit of the low nybble.

byte b = cast(byte)(nibble | ((nibble & 0x08) ? 0xF0 : 0));
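
A quick sanity check of that expression (a hedged sketch):

unittest
{
    foreach (n; 0 .. 16)
    {
        byte b = cast(byte)(n | ((n & 0x08) ? 0xF0 : 0));
        assert(b == ((n & 0x08) ? n - 16 : n)); // maps 0..15 onto -8..7
    }
}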


Re: signed nibble

2019-01-07 Thread Patrick Schluter via Digitalmars-d-learn

On Monday, 7 January 2019 at 17:23:19 UTC, Michelle Long wrote:
Is there any direct way to convert a signed nibble in to a 
signed byte with the same absolute value? Obviously I can do 
some bit comparisons but just curious if there is a very quick 
way.


byte b = nibble | ((nibble & 0x40)?0xF0:0);


Re: Bug in shifting

2018-12-19 Thread Patrick Schluter via Digitalmars-d-learn
On Tuesday, 18 December 2018 at 20:33:43 UTC, Rainer Schuetze 
wrote:



On 14/12/2018 02:56, Steven Schveighoffer wrote:

On 12/13/18 7:16 PM, Michelle Long wrote:

byte x = 0xF;
ulong y = x >> 60;


Surely you meant x << 60? As x >> 60 is going to be 0, even 
with a ulong.


It doesn't work as intuitive as you'd expect:

void main()
{
int x = 256;
int y = 36;
int z = x >> y;
writeln(z);
}

prints "16" without optimizations and "0" with optimizations. 
This happens for x86 architecture because the processor just 
uses the lower bits of the shift count. It is probably the 
reason why the language disallows shifting by more bits than 
the size of the operand.


Yes. On x86 shifting (x >> y) is in reality x >> (y & 0x1F) on 32 
bits and x >> (y & 0x3F) on 64 bits.
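
A small demonstration of the masking, with the mask written out 
explicitly so that it is well-defined in D (a sketch of what the 
x86 hardware computes):

unittest
{
    uint x = 256;
    uint y = 36;
    assert((x >> (y & 0x1F)) == 16); // 36 & 0x1F == 4, and 256 >> 4 == 16
}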


Re: Why does nobody seem to think that `null` is a serious problem in D?

2018-11-21 Thread Patrick Schluter via Digitalmars-d-learn

On Tuesday, 20 November 2018 at 23:14:27 UTC, Johan Engelen wrote:
On Tuesday, 20 November 2018 at 19:11:46 UTC, Steven 
Schveighoffer wrote:

On 11/20/18 1:04 PM, Johan Engelen wrote:


D does not make dereferencing on class objects explicit, 
which makes it harder to see where the dereference is 
happening.


Again, the terms are confusing. You just said the dereference 
happens at a.foo(), right? I would consider the dereference to 
happen when the object's data is used. i.e. when you read or 
write what the pointer points at.


But `a.foo()` is already using the object's data: it is 
accessing a function of the object and calling it. Whether it 
is a virtual function, or a final function, that shouldn't 
matter.


It matters a lot. A virtual function is called through a pointer 
reached via the instance (the vtable), so there is a dereference 
of the this pointer to get the address of the function.
For a final function, the address of the function is known at 
compile time and no dereferencing is necessary.


That is a thing that a lot of people do not get: a member 
function and a plain function are basically the same thing. What 
distinguishes them is their mangled name. You can call a 
non-virtual member function from an assembly source if you know 
the symbol name.
UFCS uses this fact, that member functions and plain functions 
are indistinguishable from an object-code point of view, to fake 
member functions.



There are different ways of implementing class function calls, 
but here often people seem to pin things down to one specific 
way. I feel I stand alone in the D community in treating the 
language in this abstract sense (like C and C++ do, other 
languages I don't know). It's similar to that people think that 
local variables and the function return address are put on a 
stack; even though that is just an implementation detail that 
is free to be changed (and does often change: local variables 
are regularly _not_ stored on the stack [*]).


Optimization isn't allowed to change behavior of a program, yet 
already simple dead-code-elimination would when null 
dereference is not treated as UB or when it is not guarded by a 
null check. Here is an example of code that also does what you 
call a "dereference" (read object data member):

```
class A {
int i;
final void foo() {
int a = i; // no crash with -O
}
}

void main() {
A a;
a.foo();  // dereference happens
}


No. There's no dereferencing. foo does nothing visible and can be 
replaced by a NOP. For the call, no dereferencing required.



```

When you don't call `a.foo()` a dereference, you basically say


Again, no dereferencing for a (final) function call. `a.foo()` is 
the same thing as `foo(a)` by reverse UFCS. The generated code is 
identical. It is only the compiler that will use different 
mangled names.
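
A tiny sketch of that point, with made-up names:

struct S { int i; }

int twice(S s) { return s.i * 2; } // a plain function

unittest
{
    S s = S(21);
    assert(s.twice() == 42); // member-call syntax on a plain function (UFCS)
    assert(twice(s) == 42);  // the same call, the same generated code
}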


that `this` is allowed to be `null` inside a class member 
function. (and then it'd have to be normal to do `if (this) 
...` inside class member functions...)


These discussions are hard to do on a mailinglist, so I'll stop 
here. Until next time at DConf, I suppose... ;-)


-Johan

[*] intentionally didn't say where those local variables _are_ 
stored, so that people can solve that little puzzle for 
themselves ;-)





Re: Why is stdio ... stdio?

2018-11-10 Thread Patrick Schluter via Digitalmars-d-learn

On Saturday, 10 November 2018 at 18:47:19 UTC, Chris Katko wrote:

On Saturday, 10 November 2018 at 13:53:14 UTC, Kagamin wrote:

[...]


There is another possibility. Have the website run (fallible) 
heuristics to detect a snippet of code and automatically 
generate it. That would leave the mailing list people 
completely unchanged.


[...]


Simply using the markup convention used on Stack Overflow and 
Reddit of formatting text as code when it is indented by 4 blanks 
would already be a good step forward. I do it now even on 
newsgroups like comp.lang.c, the only newsgroup I still use via 
Thunderbird (yeah, for the D groups I prefer the web interface, 
which really is that good, contrary to every other web-based 
newsgroup reader I ever saw).





[...]




Re: Converting a character to upper case in string

2018-09-22 Thread Patrick Schluter via Digitalmars-d-learn
On Saturday, 22 September 2018 at 06:01:20 UTC, Vladimir 
Panteleev wrote:

On Friday, 21 September 2018 at 12:15:52 UTC, NX wrote:
How can I properly convert a character, say, first one to 
upper case in a unicode correct manner?


That would depend on how you'd define correctness. If your 
application needs to support "all" languages, then (depending 
how you interpret it) the task may not be meaningful, as some 
languages don't have the notion of "upper-case" or even 
"character" (as an individual glyph). Some languages do have 
those notions, but they serve a specific purpose that doesn't 
align with the one in English (e.g. Lojban).


There are other traps in the question of uppercase/lowercase 
which make it indeed very difficult to handle correctly if we 
don't define what correct means.

Examples:
- It may be necessary to know the locale, i.e. the language of 
the string to uppercase. In Turkish the uppercase of i is not I 
but İ, and the lowercase of I is ı (that was a reason for the 
calamitously low performance of toUpper/toLower in Java, for 
example).
- Some uppercases depend on what they are used for. German ß 
should be uppercased as SS (note also, btw, that 1 code point 
becomes 2 in uppercase) in normal text, but for calligraphic 
work, road signs and other usages it can be the capital ẞ.
- Greek has two lowercase forms for Σ: σ and ς, depending on the 
position in the word.
- While it becomes less and less relevant, Serbo-Croatian may use 
digraphs when transcoding the script from Cyrillic (Serbian) to 
Latin (Croatian); these digraphs have two uppercase forms 
(title case and all capitals):

  - dž -> DŽ or Dž
  - lj -> LJ or Lj
  - nj -> NJ or Nj
  Normalization would normally take care of that case.
- Some languages may modify or remove diacritical signs when 
uppercasing. It is quite usual in French to not put accents on 
capitals.


It is also clear that the operation of uppercasing is not 
symmetric with lowercasing.
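
A minimal sketch of the locale point above: std.uni.toUpper 
applies the locale-independent Unicode mapping, so Turkish 
expectations are not met:

import std.uni : toUpper;
import std.stdio : writeln;

void main()
{
    writeln("istanbul".toUpper); // prints "ISTANBUL"; a Turkish user expects "İSTANBUL"
}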




In which code level I should be working on? Grapheme? Or maybe 
code point is sufficient?


Using graphemes is necessary if you need to support e.g. 
combining marks (e.g. ̏◌ + S = ̏S).





Re: First run after build on Windows is slow

2018-06-26 Thread Patrick Schluter via Digitalmars-d-learn

On Tuesday, 26 June 2018 at 12:58:29 UTC, Adam D. Ruppe wrote:

On Tuesday, 26 June 2018 at 12:40:05 UTC, phs wrote:
Although, it's a little bit strange because I have never had 
this issue with my C++ development.


The c++ compiler and runtime libraries are common enough that 
the antivirus realizes it is nothing special, but since D is 
more obscure it just flags it as abnormal and does further 
checks. (more obscure C++ compilers can do the same thing).


Walter has written Microsoft and they have OK'd it before, but 
then one side or the other updates and then they are out of 
whack again :(


Yes, the anti-virus situation is really annoying on Windows. I 
couldn't install dmd 2.080 on the work PC, as the AV always 
quarantined some of the installed tools, which made the install 
routine fail. The blame lies with our work PCs, which are 
incredibly misconfigured and limited (a proxy that the Microsoft 
VS installer cannot get through even though any other program 
can, an extra small system SSD, an eager anti-virus that cannot 
be told anything, profiles with read/write desktop but no delete 
(meaning a shortcut can be copied to the desktop but can never 
be removed), etc. etc.)
Just had to vent my frustration. I love D, but the elements are 
against me being able to enjoy it to the fullest...


Re: OT: Parsing object files for fun and profit

2018-05-31 Thread Patrick Schluter via Digitalmars-d-learn

On Thursday, 31 May 2018 at 18:33:37 UTC, Ali Çehreli wrote:

On 05/31/2018 09:49 AM, Adam D. Ruppe wrote:

> Should be fairly simple to follow, just realize that the
image is a 2d
> block for each char and that's why there's all those
multiplies and
> divides.

I remember doing similar things with fonts to add Turkish 
characters to Digital Research and Wordperfect products (around 
1989-1992). Localizing by patching compiled code was fun. :)


My happiest accomplishment was localizing Ventura Publisher 
"cleanly" after realizing that their language-related 
"resource" file was just an object file compiled from a simple 
C source code which had just an array of strings in it:


char * texts[] = {
"yes",
"no",
// ...
};

I parsed the object file to generate C source code, translated 
the C source code, and finally compiled it again. Voila! 
Ventura Publisher in Turkish, and everything lined-up 
perfectly. :) Before that, one had to patch the object file to 
abbreviate "evet" on top of "yes", "hayır" on top of "no", etc.


Ali


Look for BDF files. This is a quite old X Window System bitmap 
font file format that has the big advantage of being a simple 
text format, so it is easy to parse and transform. Quite a few 
fonts exist in that format.


https://en.wikipedia.org/wiki/Glyph_Bitmap_Distribution_Format

I had used it in an embedded project in the 90s and it was simple 
enough that an 80186-based terminal could handle bitmap 
proportional fonts without breaking a sweat.


Re: Compile time initialization of AA

2018-03-24 Thread Patrick Schluter via Digitalmars-d-learn

On Friday, 23 March 2018 at 22:43:47 UTC, Xavier Bigand wrote:
I am trying to initialize an global immutable associative array 
of structs, but it doesn't compile.
I am getting the following error message : "Error: not an 
associative array initializer".


As I really need to store my data for a compile time purpose if 
we can't do that with AA, I'll use arrays instead.


Here is my code :
struct EntryPoint
{
string  moduleName;
string  functionName;
boolbeforeForwarding = false;
}

immutable EntryPoint[string]  entryPoints = [
"wglDescribePixelFormat": 
{moduleName:"opengl32.forward_initialization", 
functionName:"wglDescribePixelFormat"}

];


Another, radically different solution is to not use an AA but a 
simple array.
Instead of indexing on a string you could simply index on an enum 
type. As your array is a compile-time constant, the dynamic 
nature of the AA is not really used, and indexing on an enum 
value is faster and simpler anyway.
The trick here is to generate the enum type at compile time, 
which can be achieved with a string mixin built at compile time.


Here is an example that I used to generate an enum from the 
values of another enum. In your case you can look at where names 
like "wglDescribePixelFormat" are defined and use imports or 
string-building code.


mixin({
    string code = "enum LANBIT : ulong { " ~
                  "init = 0,";  /* We set the dummy init value to 0 */

    foreach (lanCode; __traits(allMembers, LANIDX)) {
        static if (lanCode == "IN")
            code ~= "INVALID = LANIDX2LANID(LANIDX." ~ lanCode ~ "),";
        else
            code ~= lanCode ~ " = LANIDX2LANID(LANIDX." ~ lanCode ~ "),";
    }
    code ~= "
        ALL        = INVALID-1,   /**< All bits except the LANID_INVALID */
        OFFICIAL   = BG|CS|DA|DE|EL|EN|ES|ET|FI|FR|GA|HR|HU|IT|LT|LV|MT|NL|PL|PT|RO|SK|SL|SV,  /**< Official languages of the EU */
        COOFFICIAL = CA|GL|EU|GD|CY  /**< Co-official languages of the EU */
    ";
    return code ~ "}";
}());


TL/DR: defining a constant compile-time AA is an oxymoron. AAs 
are by nature dynamic runtime creatures. If the indexes are known 
at compile time, a normal array with fixed indexes is enough.
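
A minimal sketch of that alternative, with made-up names:

enum Lang { en, fr, de }

immutable string[Lang.max + 1] greetings = [
    Lang.en: "hello",
    Lang.fr: "bonjour",
    Lang.de: "hallo",
];

unittest
{
    assert(greetings[Lang.fr] == "bonjour"); // plain indexed lookup, no hashing
}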




Re: sys_write in betterC doesn't write anything

2018-02-03 Thread Patrick Schluter via Digitalmars-d-learn

On Saturday, 3 February 2018 at 15:38:19 UTC, Basile B. wrote:

On Saturday, 3 February 2018 at 15:30:10 UTC, Basile B. wrote:

[...]


okay solved:



module runnable;

__gshared static msg = "betterC\n";
__gshared static len = 8;

extern(C) int main(int argc, char** args)
{
asm
{
naked;
mov EDX, len ;//message length
mov ECX, [msg + 4] ;//message to write
mov EBX, 1   ;//file descriptor (stdout)
mov EAX, 4   ;//system call number (sys_write)
int 0x80 ;//call kernel

mov EBX, 0   ;//process' exit code
mov EAX, 1   ;//system call number (sys_exit)
int 0x80 ;//call kernel - this interrupt won't 
return

}
}

the pointer to the string data is 4 bytes later...


[msg] contains the length of the string, so there's no need for 
your len variable. Just saying.


Re: How to proceed with learning to code Windows desktop applications?

2018-01-29 Thread Patrick Schluter via Digitalmars-d-learn
On Tuesday, 30 January 2018 at 06:25:52 UTC, rikki cattermole 
wrote:

On 30/01/2018 5:47 AM, thedeemon wrote:
On Tuesday, 30 January 2018 at 03:07:38 UTC, rikki cattermole 
wrote:


But since Windows is the only platform mentioned or desired 
for, everything you need is in WinAPI!


It's like saying "everything you need is assembly language" 
when talking about languages and compilers. Pure WinAPI is a 
cruel advice for a novice.




There are libraries such as[0], so it isn't cruel, but it is 
something worth while at least to look into for someone who 
might be interested in it, but doesn't know where to begin.


[0] https://bitbucket.org/dgui/dgui


There's also DWT which has the advantage of being portable.


Re: String Type Usage. String vs DString vs WString

2018-01-15 Thread Patrick Schluter via Digitalmars-d-learn
On Monday, 15 January 2018 at 04:27:15 UTC, Jonathan M Davis 
wrote:
On Monday, January 15, 2018 03:14:02 Tony via 
Digitalmars-d-learn wrote:

On Monday, 15 January 2018 at 02:09:25 UTC, rikki cattermole

wrote:
> Unicode has three main variants, UTF-8, UTF-16 and UTF-32. 
> The size of a code point is 1, 2 or 4 bytes.


I think to be technically correct, 1 (UTF-8), 2 (UTF-16) or 4
(UTF-32) bytes are referred to as "code units" and the size of 
a

code point varies in UTF-8 and UTF-16.


Yes, for UTF-8, a code unit is 8 bits, and there can be up to 6 
of them (IIRC) in a code point.


Nooo!!! Only 4 maximum for Unicode. Beyond that it's 
obsolete crap that is not Unicode since version 2 of Unicode.






Re: Consequences of casting away immutable from pointers

2018-01-05 Thread Patrick Schluter via Digitalmars-d-learn

On Friday, 5 January 2018 at 18:13:11 UTC, H. S. Teoh wrote:
On Fri, Jan 05, 2018 at 05:50:34PM +, jmh530 via 
Digitalmars-d-learn wrote:


Be careful with that:

class C { int x; }
immutable C c = new C(5);
auto i = c.x;

C y = cast(C) c;
y.x = 10;
i = c.x; // <-- compiler may assume c.x is still 5

Since c.x is read from an immutable object, the compiler may 
assume that its value hasn't changed the second time you access 
it, so it may just elide the second assignment to i completely, 
thereby introducing a bug into the code.


Basically, casting away immutable is UB, and playing with UB is 
playing with fire. :-P




And these things are nasty. We had one in our C project last 
month that had us tearing our hair out. It was in the end a 
documentation problem in gcc that induced a misunderstanding 
of the purpose of __attribute__((malloc)) and its effect on 
aliased pointers.




Re: Efficient way to pass struct as parameter

2018-01-03 Thread Patrick Schluter via Digitalmars-d-learn

On Tuesday, 2 January 2018 at 23:27:22 UTC, H. S. Teoh wrote:


When it comes to optimization, there are 3 rules: profile, 
profile, profile.  I used to heavily hand-"optimize" my code a 
lot (I come from a strong C/C++ background -- premature 
optimization seems to be a common malady among us in that 
crowd).


That's why I always say that C++ is premature-optimization-
oriented programming, aka POOP.


Re: std.file and non-English filename in Windows

2018-01-01 Thread Patrick Schluter via Digitalmars-d-learn

On Sunday, 31 December 2017 at 18:21:29 UTC, Domain wrote:
In Windows, exists, rename, copy will report file not exists 
when you input non-English filename, such as Chinese 中文.txt


It's unclear what your problem is, but here's a wild guess.

The Windows APIs for Unicode use UTF-16 as far as I know. Strings 
in D are UTF-8, so before calling a Win32 API function they have 
to be transformed to wstring, i.e. UTF-16 strings.
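
A minimal sketch, assuming one calls a W-suffixed Win32 function 
directly (std.file already does this internally, so this is just 
for illustration):

version (Windows)
{
    import core.sys.windows.windows; // GetFileAttributesW and friends
    import std.utf : toUTF16z;       // string -> zero-terminated UTF-16

    bool fileExists(string path)
    {
        // 0xFFFFFFFF is INVALID_FILE_ATTRIBUTES
        return GetFileAttributesW(path.toUTF16z) != 0xFFFFFFFF;
    }
}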




Re: why ushort alias casted to int?

2017-12-22 Thread Patrick Schluter via Digitalmars-d-learn

On Friday, 22 December 2017 at 10:14:48 UTC, crimaniak wrote:

My code:

alias MemSize = ushort;

struct MemRegion
{
MemSize start;
MemSize length;
@property MemSize end() const { return start+length; }
}

Error: cannot implicitly convert expression 
`cast(int)this.start + cast(int)this.length` of type `int` to 
`ushort`


Both operands are the same type, so as I understand casting to 
longest type is not needed at all, and longest type here is 
ushort in any case. What am I doing wrong?


@property MemSize end() const { return cast(MemSize)(start+length); }

The rule of int promotion of smaller types comes from C, as 
others said. There are two reasons to do it that way. First, int 
is supposed in C to be the natural arithmetic type of the CPU it 
runs on, i.e. the default size the processor has the least 
difficulty handling. The second reason is that it allows 
detecting easily, without much hassle, whether the result of the 
operation is in range or not. When doing arithmetic with small 
integer types, the result overflows easily, and it is not easy 
to define that overflow behaviour portably; on some CPUs it would 
require extra instructions. D has inherited this behaviour so 
that arithmetic code copied from C behaves in the same way.


@property MemSize end() const
{
    int result = start + length;  // promoted to int, so the sum cannot wrap here
    assert(result <= MemSize.max);
    return cast(MemSize)result;
}

If arithmetic wrapped around on the small type itself, this check 
would not be possible (just as it isn't if MemSize is uint or 
ulong).


Re: Sort characters in string

2017-12-07 Thread Patrick Schluter via Digitalmars-d-learn
On Wednesday, 6 December 2017 at 15:12:22 UTC, Steven 
Schveighoffer wrote:

On 12/6/17 4:34 AM, Ola Fosheim Grøstad wrote:
On Wednesday, 6 December 2017 at 09:24:33 UTC, Jonathan M 
Davis wrote:
UTF-32 on the other hand is guaranteed to have a code unit be 
a full code point.


I don't think the standard says that? Isn't this only because 
the current set is small enough to fit? So this may change as 
Unicode grows?





The current unicode encoding has 2 million different code 
points.


2,097,152 possible codepoints. As of [Unicode 10] only 136,690 
codepoints have been assigned.



I'd say we'll all be dead and so will our great great
great grandchildren by the time unicode amasses more than 2 
billion codepoints :)


So there's enough time before the current range is even filled.




Also, UTF8 has been standardized to only have up to 4 code 
units per code point. The encoding scheme allows more, but the 
standard restricts it.




[Unicode 10]: http://www.unicode.org/versions/Unicode10.0.0/


Re: Sort characters in string

2017-12-07 Thread Patrick Schluter via Digitalmars-d-learn
On Wednesday, 6 December 2017 at 09:34:48 UTC, Ola Fosheim 
Grøstad wrote:
On Wednesday, 6 December 2017 at 09:24:33 UTC, Jonathan M Davis 
wrote:
UTF-32 on the other hand is guaranteed to have a code unit be 
a full code point.


I don't think the standard says that? Isn't this only because 
the current set is small enough to fit? So this may change as 
Unicode grows?


No. Unicode uses only 21 bits, and it is very unlikely to change 
anytime soon, as barely 17 of them are really used. This means 
the current range could still hold more than 16 times what is 
assigned now. So definitely, one UTF-32 code unit is guaranteed 
to hold any code point, forever.





Re: Sort characters in string

2017-12-07 Thread Patrick Schluter via Digitalmars-d-learn
On Wednesday, 6 December 2017 at 09:24:33 UTC, Jonathan M Davis 
wrote:
a full code point (IIRC, 1 - 6 code units for UTF-8 and 1 - 2 
for UTF-16),


YDNRC, 1 - 4 code units for UTF-8. Unicode is defined only up to 
U+10FFFF. Everything above is illegal.


Re: git workflow for D

2017-12-04 Thread Patrick Schluter via Digitalmars-d-learn
On Monday, 4 December 2017 at 11:51:42 UTC, Nick Sabalausky 
(Abscissa) wrote:

On 12/03/2017 03:05 PM, bitwise wrote:

One thing to keep in mind: Any time you're talking about moving 
anything from one repo to another, there's exactly two basic 
primitives there: push and pull. Both of them are basically the 
same simple thing: All they're about is copying the latest new 
commits (or tags) from WW branch on XX repo, to YY branch on ZZ 
repo. All other git commands that move anything bewteen repos 
start out with this basic "push" or "pull" primitive. (Engh, 
technically "fetch" is even more of a primitive than those, but 
I find it more helpful to think in terms of "push/pull" for the 
most typical daily tasks.)
No, the pair is push/fetch. pull is fetch+merge, and a lot of 
confusion comes from that, in fact. I've seen several people 
cursing git because of the idea that pull is the opposite of 
push. When I explained that they should never use git pull, but 
always separate the fetch from the merge, it clicked every time.
So, avoid pull: look first at what fetch brought in, and if that 
is what you expected, do the merge and be happy.




Re: git workflow for D

2017-12-03 Thread Patrick Schluter via Digitalmars-d-learn

On Monday, 4 December 2017 at 01:54:57 UTC, ketmar wrote:

Basile B. wrote:

On Sunday, 3 December 2017 at 22:22:47 UTC, Arun 
Chandrasekaran wrote:
Git CLI is arcane and esoteric. I've lost my commits before 
(yeah, my mistake).


Who hasn't ;)

me.

Happened to me last time because i tried a command supposed to 
remove untracked files in submodules...but used "reset" in a 
wrong way... ouch.


"git reflog". nothing commited is *ever* lost until you do "git 
gc".


This needs to be repeated: nothing in git is ever lost if it has 
been committed. You can lose untracked files, but commits do not 
disappear.
If you're unsure before an operation, or find git reflog hard to 
use: before doing the operation, simply create a branch, e.g. 
git branch life-draft (or whatever you want to call it). If the 
operation fails, you still have the commit your HEAD was on, 
referenced by the life-draft branch.
Branches and tags are just pointers into the directed graph that 
a git repository is. The interface simply does not display the 
commits that are not reachable from such a pointer.


git sometimes does GC on its own, so you can turn it off

with:

git config --global gc.auto 0

don't forget to manually GC your repo then with "git gc", or it 
may grow quite huge.





Re: scope(exit) and Ctrl-C

2017-12-02 Thread Patrick Schluter via Digitalmars-d-learn

On Saturday, 2 December 2017 at 04:49:26 UTC, H. S. Teoh wrote:
On Sat, Dec 02, 2017 at 04:38:29AM +, Adam D. Ruppe via 
Digitalmars-d-learn wrote:

[...]


Signal handlers can potentially be invoked while inside a 
non-reentrant libc or OS function, so trying to do anything 
that (indirectly or otherwise) calls that function will cause 
havoc to your program.  Also, while the signal handler is 
running, some (all?) further signals may be blocked, meaning 
that your program might miss an important signal if your sig 
handler takes too long to run.  Furthermore, the signal may 
have happened in the middle of your own code, so race 
conditions may apply (e.g. if you're modifying global data in 
both).


[...]


On Linux you can use signalfd() for that, but it's a nice trick 
if you want POSIX portability.
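
For reference, a hedged sketch of the signalfd() approach (Linux 
only, error handling elided; the druntime module paths are 
assumed to be current):

version (linux)
{
    import core.sys.linux.sys.signalfd;
    import core.sys.posix.signal;
    import core.sys.posix.unistd : read;
    import std.stdio : writeln;

    void main()
    {
        sigset_t mask;
        sigemptyset(&mask);
        sigaddset(&mask, SIGINT);
        sigprocmask(SIG_BLOCK, &mask, null); // block normal async delivery

        int sfd = signalfd(-1, &mask, 0);
        signalfd_siginfo si;
        read(sfd, &si, si.sizeof); // blocks until Ctrl-C arrives
        writeln("got SIGINT, cleaning up normally");
    }
}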


Re: Private imports and Objects

2017-11-30 Thread Patrick Schluter via Digitalmars-d-learn
On Thursday, 30 November 2017 at 06:44:43 UTC, Jonathan M Davis 
wrote:
Object exists primarily because D didn't originally have 
templates, and when you don't have templates, having a single 
base class is the only way to have a function accept any class, 
and for something like a container, you'd pretty much be forced 
to use void* without Object if you don't have templates. 
However, once templates were added to D, the benefits of Object 
were significantly reduced, and it's arguably not a 
particularly good idea to be writing code that operates on 
Object. However, it's far too late in the game to get rid of 
Object.


And they come in very handy when one ports code from Java. As a 
first approach, a simple 1-to-1 adaptation of the code makes the 
porting so much easier. Templates and D magic can come afterwards.





Re: ESR on post-C landscape

2017-11-16 Thread Patrick Schluter via Digitalmars-d-learn
On Tuesday, 14 November 2017 at 16:38:58 UTC, Ola Fosheim Grostad 
wrote:

On Tuesday, 14 November 2017 at 11:55:17 UTC, codephantom wrote:

[...]


Well, in another thread he talked about the Tango split, so not 
sure where he is coming from.



[...]


No, the starting point for C++ was that Simula is better for a 
specific kind of modelling than C.



[...]


It is flawed... ESR got that right, not sure how anyone can 
disagree. The only thing C has going for it is that CPU designs 
have been adapted to C for decades. But that is changing. C no 
longer models the hardware in a reasonable manner.


Because of the flawed interpretation of UB by the compiler 
writers, not because of a property of the language itself.


Re: ESR on post-C landscape

2017-11-16 Thread Patrick Schluter via Digitalmars-d-learn
On Tuesday, 14 November 2017 at 09:43:07 UTC, Ola Fosheim Grøstad 
wrote:

On Tuesday, 14 November 2017 at 06:32:55 UTC, lobo wrote:
"[snip]...Then came the day we discovered that a person we 
incautiously gave commit privileges to had fucked up the 
games’s AI core. It became apparent that I was the only dev on 
the team not too frightened of that code to go in. And I fixed 
it all right – took me two weeks of struggle. After which I 
swore a mighty oath never to go near C++ again. ...[snip]"


Either no one manages SW in his team so that this "bad" dev 
could run off and to build a monster architecture, which would 
take weeks, or this guy has no idea how to revert commit.


ESR got famous for his cathedral vs bazaar piece, which IMO was 
basically just a not very insightful allegory over waterfall vs 
evolutionary development models, but since many software 
developers don't know the basics of software development he 
managed to become infamous for it… But I think embracing 
emergence has hurt open source projects more than it has helped 
it. D bears signs of too much emergence too, and is still 
trying correct those «random moves» with DIPs.


ESR states «C is flawed, but it does have one immensely 
valuable property that C++ didn’t keep – if you can mentally 
model the hardware it’s running on, you can easily see all the 
way down. If C++ had actually eliminated C’s flaws (that it, 
been type-safe and memory-safe) giving away that transparency 
might be a trade worth making. As it is, nope.»


I don't think this is true, you can reduce C++ down to the 
level where it is just like C. If he cannot mentally model the 
hardware in C++ that basically just means he has never tried to 
get there…


The sheer amount of inscrutable cruft and rules, plus the moving 
target of continuously changing semantics, an order or two of 
magnitude bigger than C's, added to the fact that you still need 
to know C's gotchas, makes it one or two orders of magnitude more 
difficult to mentally model the hardware. You can also mentally 
model the hardware with Intercal; if you haven't managed, it just 
means you haven't tried hard enough.




I also think he is in denial if he does not see that C++ is 
taking over C. Starting a big project in C today sounds like a 
very bad idea to me.


Even worse in C++ with its changing standards every 5 years.






Re: What the hell is wrong with D?

2017-09-23 Thread Patrick Schluter via Digitalmars-d-learn
On Tuesday, 19 September 2017 at 18:34:13 UTC, Brad Anderson 
wrote:

On Tuesday, 19 September 2017 at 18:17:47 UTC, jmh530 wrote:
On Tuesday, 19 September 2017 at 17:40:20 UTC, EntangledQuanta 
wrote:


Thanks for wasting some of my life... Just curious about who 
will justify the behavior and what excuses they will give.


Pretty sure it would be exactly the same thing in C...


It is (and Java and C# and pretty much every other C style 
language though the nicer implicit conversion rules means it 
gets caught more easily). It is a big source of programmer 
mistakes. It comes up frequently in PVS Studio's open source 
analysis write ups.


So I checked all the languages listed: C, C#, Java, JavaScript, 
C++, PHP, Perl and D. All have the same order of precedence 
except, as always, the abomination of all languages: C++ (kill 
it with fire).
C++ is the only language in which the ternary operator has the 
same precedence as the assignment operators.
This means a>=5?b=100:b=200; will compile in C++ but not in any 
of the other languages. That's one reason why it irritates me 
when people continuously refer to C and C++ as if they were the 
same thing (yes, I mean you, Walter and Andrei).
Even PHP and Perl got it right; isn't that a testament of poor 
taste, Bjarne? :-)
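
A minimal sketch of the difference (hypothetical values):

void main()
{
    int a = 6;
    int b;
    // a >= 5 ? b = 100 : b = 200;   // compiles in C++ only; D rejects it
    b = a >= 5 ? 100 : 200;          // the idiomatic D (and C) way
    assert(b == 100);
}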










Re: 24-bit int

2017-09-03 Thread Patrick Schluter via Digitalmars-d-learn

On Friday, 1 September 2017 at 22:10:43 UTC, Biotronic wrote:
On Friday, 1 September 2017 at 19:39:14 UTC, EntangledQuanta 
wrote:
Is there a way to create a 24-bit int? One that for all 
practical purposes acts as such? This is for 24-bit stuff like 
audio. It would respect endianness, allow for arrays int24[] 
that work properly, etc.


I haven't looked at endianness beyond it working on my 
computer. If you have special needs in that regard, consider 
this a starting point:


big endian is indeed problematic.


@property
int value(int x) {
    _payload = (cast(ubyte*)&x)[0..3];
    return value;
}



will not work on a big endian machine.

version(BigEndian)
    _payload = (cast(ubyte*)&x)[1..4];



Re: Why structs and classes instanciations are made differently ?

2017-07-26 Thread Patrick Schluter via Digitalmars-d-learn
On Monday, 24 July 2017 at 17:42:30 UTC, Steven Schveighoffer 
wrote:

On 7/24/17 11:45 AM, Houdini wrote:
On Monday, 24 July 2017 at 15:41:33 UTC, Steven Schveighoffer 
wrote:


Because types with inheritance generally don't work right if 
you pass by value (i.e. the slicing problem).


structs don't support inheritance or virtual functions, so 
they can be safely passed by value.


But in C++, we pass them by reference also to avoid copies 
(const &).
The potential polymorphic usage is not the only point to 
consider.




In C++ class and struct are pretty much interchangeable, so 
technically, class is a wasted keyword for default visibility.


In D, I would use classes for any time I need polymorphism, and 
use structs otherwise.


-Steve


It also has the nice property that porting code from Java/C# is 
actually really easy when using classes, as they have more or 
less the same semantics. When porting code from C and C++ it is 
often better to use structs.




Re: Funny issue with casting double to ulong

2017-07-04 Thread Patrick Schluter via Digitalmars-d-learn

On Tuesday, 4 July 2017 at 00:35:10 UTC, H. S. Teoh wrote:
On Mon, Jul 03, 2017 at 07:13:45AM +, Era Scarecrow via 
Digitalmars-d-learn wrote:

On Monday, 3 July 2017 at 06:20:22 UTC, H. S. Teoh wrote:

[...]
> I don't think there's a way to change how the FPU works -- 
> the hardware is coded that way and can't be changed.  You'd 
> have to build your own library or use an existing one for 
> this purpose.


 It's been a while, i do recall there was BCD options, 
actually found
a few of the instructions; However they are more on 
loading/storing
the value, not on working strictly in that mode. Last i 
remember

seeing references to BCD work was in 2000 or so.

 I'll have to look further before i find (or fail to find) all 
that's
BCD related. Still if it IS avaliable, it would be an x87 only 
option
and thus wouldn't be portable unless the language or a library 
offered

support.


Wow, that brings back the memories... I used to dabble with BCD 
(only a little bit) back when I was playing with 8086/8088 
assembly language. But I've not heard anything about BCD since 
that era, and I'm surprised people still know what it is. :-D  
But all I knew about BCD was in the realm of integer 
arithmetic. I had no idea such things as BCD floats existed.


In times of lore, BCD floats were very common. The Sharp Pocket 
Computer used a BCD float format, and writing machine code on it 
confronted one with the format. The TI-99/4A home computer also 
used a BCD float format in its Basic interpreter. It had the same 
properties as the float format of the TI calculators, i.e. 10 
visible significant digits (+ 3 hidden digits) and exponents 
going from -99 to +99. It was only when I switched to Applesoft 
Basic on the Apple II that I discovered the horrors of binary 
floating-point numbers.
Since arithmetic co-processors became the norm, one only sees 
binary floats anymore.





Re: Funny issue with casting double to ulong

2017-07-03 Thread Patrick Schluter via Digitalmars-d-learn

On Monday, 3 July 2017 at 05:38:56 UTC, Era Scarecrow wrote:

On Monday, 3 July 2017 at 03:57:25 UTC, Basile B wrote:

6.251 has no perfect double representation. It's real value is:


 I almost wonder if a BCD, fixed length or alternative for 
floating point should be an option... Either library, or a hook 
to change how the FPU works since doubles are suppose to do 
16-18 digits of perfect simple floatingpoint for the purposes 
of money and the like without relying on such imperfect 
transitions.


IBM zSeries and POWER (since POWER6) have a BCD floating-point unit...


Re: How to implement opCmp?

2017-06-13 Thread Patrick Schluter via Digitalmars-d-learn

On Tuesday, 13 June 2017 at 16:49:14 UTC, H. S. Teoh wrote:
On Tue, Jun 13, 2017 at 10:51:40AM -0400, Steven Schveighoffer 
via Digitalmars-d-learn wrote: [...]
I think Andrei has a nice way to do opCmp for integers that's 
a simple subtraction and negation or something like that.

[...]

In theory, cmp(int x, int y) can be implemented simply as (x - 
y).
However, this fails when integer overflow occurs.  Does Andrei 
have a
nice way of doing this that isn't vulnerable to integer 
overflow?




return (x > y) - (x < y);

According to this Stack Overflow question it looks like a good 
candidate:

https://stackoverflow.com/questions/10996418/efficient-integer-compare-function
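
Wrapped into an opCmp it could look like this (a hedged sketch 
with made-up names):

struct Build
{
    int number;

    int opCmp(const Build other) const
    {
        // bool - bool promotes to int: yields -1, 0 or 1 and cannot
        // overflow, unlike the naive `number - other.number`.
        return (number > other.number) - (number < other.number);
    }
}

unittest
{
    assert(Build(int.max) > Build(int.min)); // x - y would wrap around here
    assert(Build(1) < Build(2));
}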





Re: Use template functions within mixin

2017-06-06 Thread Patrick Schluter via Digitalmars-d-learn

On Tuesday, 6 June 2017 at 15:00:50 UTC, Timoses wrote:

Hey there,

I'm wondering how I can use a template function within my mixin:

```
ubyte[] value = x[33, 3a,3f, d4];
foreach (type; TypeTuple!("int", "unsigned 
int", "byte"))

{
mixin(`if (value.length == type.sizeof)
{
ubyte[type.sizeof] raw = 
value[0..$];
auto fValue = 
raw.littleEndianToNative!(type);

displayinfo(fValue);
}
break;
`);
}
```


Error: template std.bitmanip.littleEndianToNative cannot deduce 
function from argument types !("int")(ubyte[8]), candidates are:
[..]\src\phobos\std\bitmanip.d(2560,3):
std.bitmanip.littleEndianToNative(T, uint n)(ubyte[n] val) if 
(canSwapEndianness!T && n == T.sizeof)


```
`raw.littleEndianToNative!` ~ type ~ `;`
```

Did you also put the ` ~ type ~ ` in the two other places where 
you use the variable type?


mixin(`if (value.length == ` ~ type ~ `.sizeof)
{
    ubyte[` ~ type ~ `.sizeof] raw = value[0..$];
    auto fValue = raw.littleEndianToNative!(` ~ type ~ `);
    displayinfo(fValue);
}
break;
`);



Re: std.path.buildPath

2017-06-04 Thread Patrick Schluter via Digitalmars-d-learn

On Sunday, 4 June 2017 at 15:56:58 UTC, Jacob Carlborg wrote:

On 2017-06-04 07:44, Jesse Phillips wrote:

What is your expected behavior? Throw an exception? You can't 
really

append an absolute path to another.


Of course you can. I expect buildPath("/foo", "/bar") to result 
in "/foo/bar". That's how Ruby behaves.


buildPath("/usr/bin", "/usr/bin/gcc")

/usr/bin/usr/bin/gcc would obviously be wrong. I think the 
semantics are not as illogical as they seem at first glance.
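
A small illustration of the documented rule that an absolute 
segment drops everything before it (POSIX paths shown):

version (Posix) unittest
{
    import std.path : buildPath;
    assert(buildPath("/foo", "bar") == "/foo/bar");
    assert(buildPath("/foo", "/bar") == "/bar"); // absolute segment resets
    assert(buildPath("/usr/bin", "/usr/bin/gcc") == "/usr/bin/gcc");
}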


Re: Creating and loading D plugins in D app

2017-06-03 Thread Patrick Schluter via Digitalmars-d-learn

On Saturday, 3 June 2017 at 09:41:58 UTC, aberba wrote:

On Friday, 2 June 2017 at 16:36:34 UTC, H. S. Teoh wrote:
On Fri, Jun 02, 2017 at 12:19:48PM +, Adam D. Ruppe via 
Digitalmars-d-learn wrote:

On Friday, 2 June 2017 at 11:09:05 UTC, aberba wrote:
> 1. Get shared libs to work in D (the best approach for all 
> D code)


I have done very little with this myself but other people 
have so it is doable.

[...]

This is not directly related to the OP's question, but 
recently I wrote a program that, given a user-specified 
string, transforms it into D code using a code template, 
invokes dmd to compile it into a shared object, loads the 
shared object using dlopen(), and looks up the generated 
function with dlsym() to obtain a function pointer that can be 
used for calling the function. The shared object is unloaded 
after it's done.


Will be of much use to me to see the brief instructions for 
this. I saw the C style on Wikipedia.


Seems the functions loaded needs to be casted from void* to a 
type... before calling. Didn't quite understand that part.


Yes, dlsym() returns the address of the object of the shared 
library that you requested. The first parameter is the handle to 
the shared object that was loaded by dlopen(). The second 
parameter is the name of the symbol one wants the address of.
The address must then be cast to the type of the object; if the 
name is that of a function, one has to cast to a function 
pointer. That's something illegal in strict C (i.e. undefined 
behaviour), but it is required by POSIX. It's really not very 
difficult.
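
A hedged sketch of the whole dance from D (POSIX only; the 
library and symbol names are hypothetical):

import core.sys.posix.dlfcn;
import std.stdio : writeln;

alias EntryFn = extern(C) int function(int);

void main()
{
    void* handle = dlopen("libplugin.so", RTLD_LAZY); // hypothetical library
    if (handle is null) { writeln("dlopen failed"); return; }
    scope(exit) dlclose(handle);

    // dlsym returns void*; it must be cast to the right function pointer type
    auto fn = cast(EntryFn) dlsym(handle, "plugin_entry"); // hypothetical symbol
    if (fn is null) { writeln("dlsym failed"); return; }
    writeln(fn(42));
}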


Re: howto count lines - fast

2017-05-31 Thread Patrick Schluter via Digitalmars-d-learn

On Thursday, 1 June 2017 at 04:39:17 UTC, Jonathan M Davis wrote:
On Wednesday, May 31, 2017 16:03:54 H. S. Teoh via 
Digitalmars-d-learn wrote:

[...]

Digitalmars-d-learn wrote:

[...]


If you're really trying to make it fast, there may be something 
that you can do with SIMD. IIRC, Brian Schott did that with his 
lexer (or maybe he was just talking about it - I don't remember 
for sure).




See my link above to realworldtech. Using SIMD can give good 
results in micro-benchmarks but completely screw up the 
performance of other things in practice (the alignment 
requirements are heavy and result in code bloat, cache misses, 
TLB misses, higher context-switch costs, AVX warm-up time (Agner 
Fog observed a substantial warm-up delay before AVX switches from 
128-bit to 256-bit operation), reduced turboing, etc.).


Re: howto count lines - fast

2017-05-31 Thread Patrick Schluter via Digitalmars-d-learn

On Wednesday, 31 May 2017 at 23:03:54 UTC, H. S. Teoh wrote:
On Wed, May 31, 2017 at 03:46:17PM -0700, Jonathan M Davis via 
Digitalmars-d-learn wrote:
On Wednesday, May 31, 2017 12:13:04 H. S. Teoh via 
Digitalmars-d-learn wrote:
> I did some digging around, and it seems that wc is using 
> glibc's memchr, which is highly-optimized, whereas 
> std.algorithm.count just uses a simplistic loop. Which is 
> strange, because I'm pretty sure somebody optimized 
> std.algorithm some time ago to use memchr() instead of a 
> loop when searching for a byte value in an array. Whatever 
> happened to that??


I don't know, but memchr wouldn't work with CTFE, so someone 
might have removed it to make it work in CTFE (though that 
could be done with a different branch for CTFE). Or maybe it 
never made it into std.algorithm for one reason or another.

[...]

I checked the Phobos code again, and it appears that my memory 
deceived
me. Somebody *did* add memchr optimization to find() and its 
friends,

but not to count().

CTFE compatibility is not a problem, since we can just 
if(__ctfe) the

optimized block away.

I'm currently experimenting with a memchr-optimized version of 
count(), but I'm getting mixed results: on small arrays or 
large arrays densely packed with matching elements, the memchr 
version runs rather slowly, because it involves a function call 
into the C library per matching element.  On large arrays only 
sparsely populated with matching elements, though, the 
memchr-optimized version beats the current code by about an 
order of magnitude.


Since it wouldn't be a wise idea to assume sparsity of matches 
in Phobos, I decided to do a little more digging, and looked up 
the glibc implementation of memchr. The main optimization is 
that it iterates over the array not by byte, as a naïve loop 
would do, but by ulong's.


That's what I suggested above. It's the first optimisation to do 
when looping over a buffer (memcpy, memset, memchr etc.).



 (Of course, the first n bytes and
last n bytes that are not ulong-aligned are checked with a 
per-byte loop; so for very short arrays it doesn't lose out to 
the naïve loop.)  In each iteration over ulong, it performs the 
bit-twiddling hack alluded to by Nitram to detect the presence 
of matching bytes, upon which it breaks out to the closing 
per-byte loop to find the first match. For short arrays, or 
arrays where a match is quickly found, it's comparable in 
performance to the naïve loop; for large arrays where the match 
is not found until later, it easily outperforms the naïve loop.


It is also important not to overdo the optimisations, as the 
generated overhead can manifest in pessimisations that are not 
visible in a specific benchmark. The code-size explosion may 
induce I-cache misses, and it can also cost I-TLB misses. Worse, 
using SSE or AVX can hurt thread-switch time or even reduce the 
turboing of the CPU.
It's currently a hot topic on realworldtech[1]: Linus Torvalds 
rants about this issue with memcpy(), which is over-engineered 
and does more harm than good in practice while having nice 
benchmark results.




My current thought is to adopt the same approach: iterate over 
size_t or some such larger unit, and adapt the bit-twiddling 
hack to be able to count the number of matches in each size_t.  
This is turning out to be trickier than I'd like, though, 
because there is a case where carry propagation makes it 
unclear how to derive the number of matches without iterating 
over the bytes a second time.


But this may not be a big problem, since size_t.sizeof is 
relatively small, so I can probably loop over individual bytes 
when one or more matches is detected, and a 
sufficiently-capable optimizer like ldc or gdc would be able to 
unroll this into a series of sete + add instructions, no 
branches that might stall the CPU pipeline. For 
densely-matching arrays, this should still have comparable 
performance to the naïve loops; for sparsely-matching arrays 
this should show significant speedups.


That's what I think too: a small and simple loop counting the
matching bytes in the ulong would be somewhat faster than the
bit-twiddling trick, which requires a population count of the bits.
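
For what it's worth, here is a sketch (my own code, untested, not
from Phobos) of a carry-free mask that sidesteps the borrow problem
mentioned above: the per-byte addition (b & 0x7F) + 0x7F can never
carry across a byte boundary, so the mask is exact and popcnt yields
the true match count.

import core.bitop : popcnt;

// Counts the bytes of w equal to c, exactly. Unlike the
// (v - 0x01..01) & ~v & 0x80..80 trick, no borrow can cross a
// byte boundary here, so popcnt over the mask never overcounts.
size_t countByteMatches(ulong w, ubyte c)
{
    enum ulong ones = 0x0101010101010101UL;
    enum ulong lo7  = 0x7F7F7F7F7F7F7F7FUL;
    ulong x = w ^ (ones * c);                 // matching bytes become 0x00
    ulong nz = (((x & lo7) + lo7) | x) | lo7; // high bit set iff byte != 0
    return popcnt(~nz);                       // count the zero bytes
}

unittest
{
    // A 0x01 byte directly above a 0x00 byte trips the borrowing
    // trick (it would report 2); the carry-free mask reports 1.
    assert(countByteMatches(0xFFFF_FFFF_FFFF_0100UL, 0) == 1);
    assert(countByteMatches(0x0A0B_0A0A_0A0A_0A0AUL, 0x0A) == 7);
}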


[1]: http://www.realworldtech.com/forum/?threadid=168200&curpostid=168700




Re: howto count lines - fast

2017-05-31 Thread Patrick Schluter via Digitalmars-d-learn

On Tuesday, 30 May 2017 at 23:41:01 UTC, H. S. Teoh wrote:
On Tue, May 30, 2017 at 08:02:38PM +, Nitram via 
Digitalmars-d-learn wrote:
After reading
https://dlang.org/blog/2017/05/24/faster-command-line-tools-in-d/, I was wondering how fast one can do a simple "wc -l" in D.



size_t lineCount3(string filename)
{
    import std.mmfile;

    auto f = new MmFile(filename);
    auto data = cast(ubyte[]) f[];
    size_t c;

    foreach (i; 0 .. data.length)
    {
        if (data[i] == '\n') c++;
    }
    return c;
}

// real    0m0.242s
// user    0m1.151s
// sys     0m0.057s


You should try something more like

auto data = cast(ulong[]) f[];

foreach (i; 0 .. data.length) // the cast to ulong[] already scales the length

and then use bit fiddling to count the number of '\n' bytes in the
loaded ulong. This divides the number of load instructions by 8, and
the counting of '\n' in the loaded word then only uses registers. It
is also possible to use bit fiddling to detect and count the
characters in that ulong; I don't know if it is really faster than
the plain byte loop, though.


Here is a function to detect if a given character is in an ulong:

auto detect(alias CHAR)(ulong t)
{
    enum ulong u = CHAR;
    // Replicate the character into all 8 bytes of a mask.
    enum mask1 = u | (u << 8) | (u << 16) | (u << 24UL) | (u << 32UL);
    enum mask = (mask1 << 32) | mask1;
    // Classic zero-byte test applied to t ^ mask.
    return ((t ^ mask) - 0x0101010101010101UL) & ~(t ^ mask)
           & 0x8080808080808080UL;
}

The returned value is 0 if the character is not in t, and the
highest bit of each byte is set if that byte contained the
character. If the CPU has a fast popcnt, counting should be easy.
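
As a usage sketch (mine, untested) combining detect with
core.bitop.popcnt over the word-sized part of a buffer, finishing
the tail byte by byte. One caveat: because the subtraction borrows
across bytes, a byte equal to CHAR^1 sitting just above a matching
byte is also flagged, so popcnt can overcount slightly; for '\n'
(0x0A) the offending neighbour is 0x0B, rare in real text, but a
carry-free mask or a small per-word byte loop is safer when an
exact count matters.

import core.bitop : popcnt;

size_t lineCount4(const(ubyte)[] data)
{
    size_t c;
    immutable tail = data.length % ulong.sizeof;
    // The cast reinterprets the prefix and scales the length to words.
    auto words = cast(const(ulong)[]) data[0 .. $ - tail];
    foreach (w; words)
    {
        if (auto hits = detect!'\n'(w)) // cheaply skip words without '\n'
            c += popcnt(hits);          // see the overcount caveat above
    }
    foreach (b; data[$ - tail .. $])    // unaligned tail, byte by byte
        if (b == '\n') c++;
    return c;
}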





Re: The syntax of sort and templates

2017-05-26 Thread Patrick Schluter via Digitalmars-d-learn

On Friday, 26 May 2017 at 09:59:26 UTC, zakk wrote:

Hello everyone,

I just started using D and I am a bit puzzled by the syntax of
the sort function in std.algorithm.sorting, which is


sort!(comparingFunction)(list)

where comparingFunction is often a lambda expression. For 
instance in the Wolfram Language the equivalent function is


Sort[list,comparingFunction]

My questions are:

1) Why is D making use of the binary ! operator, which as far
as I understand introduces a template?


The ! here indicates where the template parameters start in a
template invocation:


templatename!(compile time parameter)(runtime parameter)
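
For instance (my own example), the sort call from the question fits
that shape exactly:

import std.algorithm.sorting : sort;

void main()
{
    auto list = [3, 1, 2];
    // templatename!(compile time parameter)(runtime parameter)
    sort!((a, b) => a < b)(list);
    assert(list == [1, 2, 3]);
}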



2) Why is a template needed here?


The template allows the compiler to generate the code so that the
comparison is known at compile time and can be optimised properly.
Compare with C, where sorting is done via the qsort() function: the
comparison must be provided as a function pointer, so qsort() has to
call a function even for the simplest comparison. Furthermore, this
call is indirect, which on some processors cannot be predicted and
takes an inordinately long time to run.
Another nuisance associated with qsort() and function pointers in C
is that you have to define a separate, named function that does the
type conversions, because qsort() only works with void* parameters.
This makes it slow, cumbersome and error prone.
All these drawbacks are nonexistent in D thanks to the template,
which takes the lambda, i.e. an anonymous function defined with the
right types, and inserts it into the sort code as if it had been
written by hand.
The template replaces the macro preprocessor of C, but at a more
profound and intimate level of the language.
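
To make the contrast concrete, here is a small sketch (my own
example; the comparator name is made up) pitting std.algorithm's
sort, whose lambda is a compile-time argument that can be inlined,
against C's qsort() called through D's C bindings, where every
comparison goes through a function pointer and void* casts:

import std.algorithm.sorting : sort;
import core.stdc.stdlib : qsort;

// C-style comparator: a separate, indirectly called function.
extern (C) nothrow @nogc
int cmpInts(scope const void* a, scope const void* b)
{
    immutable x = *cast(const(int)*) a;
    immutable y = *cast(const(int)*) b;
    return (x > y) - (x < y); // avoids the overflow risk of x - y
}

void main()
{
    int[] a = [3, 1, 2];
    int[] b = a.dup;

    // D: the comparison is a template argument, known at compile time.
    a.sort!((x, y) => x < y);

    // C: the comparison is a runtime function pointer.
    qsort(b.ptr, b.length, int.sizeof, &cmpInts);

    assert(a == [1, 2, 3] && b == [1, 2, 3]);
}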




3) It seems to me like the argument passed to the template is a 
lambda expression. I only know about templates taking types as 
argument. What's going on?


That's the strength of D: templates are not limited to types.
Almost anything can be templatized, which opens possibilities that
other languages can't even begin to conceive of.
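
A tiny sketch of that (my own example): a single template taking a
type, a compile-time value, and a callable as parameters.

// T is a type, N a value, and fill a callable -- all template parameters.
T[N] makeFilled(T, size_t N, alias fill)()
{
    T[N] a;
    foreach (i, ref e; a)
        e = fill(i); // the lambda is instantiated with the right types
    return a;
}

void main()
{
    auto squares = makeFilled!(int, 5, i => cast(int)(i * i))();
    assert(squares == [0, 1, 4, 9, 16]);
}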




Many thanks!





Re: No tempFile() in std.file

2017-05-16 Thread Patrick Schluter via Digitalmars-d-learn

On Wednesday, 17 May 2017 at 05:30:40 UTC, Patrick Schluter wrote:

On Tuesday, 16 May 2017 at 13:56:57 UTC, Jonathan M Davis wrote:

[...]


As your solution doesn't inherently solve the race condition
associated with temporary files, you could still generate the name
with a wrapper around tempnam() or tmpnam() (on Posix; for Windows
I don't know). This would avoid the double open() of the scenario
above.


But as Jonathan said above, this is not a good solution in any
case. On Posix, using the mks*temp() family of functions is now
the standard approach.
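
For reference, a minimal Posix-only sketch (untested; the path and
prefix are made up) of doing it safely from D with mkstemp(), which
creates and opens the file atomically so no other process can race
on the name:

version (Posix)
{
    import core.sys.posix.stdlib : mkstemp;
    import core.sys.posix.unistd : close, unlink, write;
    import std.exception : errnoEnforce;

    void main()
    {
        // mkstemp() rewrites the XXXXXX part in place, so the
        // template must be a mutable, NUL-terminated buffer.
        char[] tmpl = "/tmp/myapp-XXXXXX\0".dup;
        immutable fd = mkstemp(tmpl.ptr);
        errnoEnforce(fd != -1, "mkstemp failed");
        scope (exit) close(fd);
        scope (exit) unlink(tmpl.ptr); // remove the file when done

        write(fd, "hello".ptr, 5);
    }
}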

