How to know whether a file's encoding is ansi or utf8?

2014-07-22 Thread Sam Hu via Digitalmars-d-learn

Greetings!

As the subject says, how can I tell whether a file is encoded in UTF-8 or
ANSI?


Thanks for the help in advance.

Regards,
Sam


Re: How to know whether a file's encoding is ansi or utf8?

2014-07-22 Thread Sam Hu via Digitalmars-d-learn

On Tuesday, 22 July 2014 at 09:50:00 UTC, Sam Hu wrote:

> Greetings!
>
> As the subject says, how can I tell whether a file is encoded in UTF-8
> or ANSI?
>
> Thanks for the help in advance.
>
> Regards,
> Sam


Sorry, I mean in code. For example, when I read a file's
content and print it to a text control in a GUI, or to the console, I
need to proceed differently depending on the file's encoding.


Re: fork/waitpid and std.concurrency.spawn

2014-07-22 Thread FreeSlave via Digitalmars-d-learn

On Tuesday, 22 July 2014 at 07:58:50 UTC, Puming wrote:

> Is there a fork()/wait() API similar to std.concurrency spawn()?
>
> The best thing I've got so far is
> core.sys.posix.unistd.fork(), but it seems to only work on
> POSIX. Is there a unified API for process-level concurrency,
> ideally with actor and message-passing support too?


You need std.process.


Re: How to know whether a file's encoding is ansi or utf8?

2014-07-22 Thread Alexandre via Digitalmars-d-learn

Read the BOM?

module main;

import std.stdio;

enum Encoding
{
    UTF7,
    UTF8,
    UTF32,
    Unicode,
    BigEndianUnicode,
    ASCII
}

Encoding GetFileEncoding(string fileName)
{
    import std.file;

    // Read at most the first 4 bytes; the file may be shorter than that.
    auto bom = cast(ubyte[]) read(fileName, 4);

    if (bom.length >= 3 && bom[0] == 0x2b && bom[1] == 0x2f && bom[2] == 0x76)
        return Encoding.UTF7;
    if (bom.length >= 3 && bom[0] == 0xef && bom[1] == 0xbb && bom[2] == 0xbf)
        return Encoding.UTF8;
    if (bom.length >= 2 && bom[0] == 0xff && bom[1] == 0xfe)
        return Encoding.Unicode; // UTF-16LE
    if (bom.length >= 2 && bom[0] == 0xfe && bom[1] == 0xff)
        return Encoding.BigEndianUnicode; // UTF-16BE
    if (bom.length >= 4 && bom[0] == 0 && bom[1] == 0 && bom[2] == 0xfe && bom[3] == 0xff)
        return Encoding.UTF32; // UTF-32BE

    // No BOM found.
    return Encoding.ASCII;
}

void main(string[] args)
{
    if (GetFileEncoding("test.txt") == Encoding.UTF8)
        writeln("The file is UTF8");
    else
        writeln("File is not UTF8 :(");
}



On Tuesday, 22 July 2014 at 09:50:00 UTC, Sam Hu wrote:

> Greetings!
>
> As the subject says, how can I tell whether a file is encoded in UTF-8
> or ANSI?
>
> Thanks for the help in advance.
>
> Regards,
> Sam




Re: Calling dynamically bound functions from weakly pure function

2014-07-22 Thread Rene Zwanenburg via Digitalmars-d-learn

On Saturday, 19 July 2014 at 11:12:00 UTC, Marc Schütz wrote:
> Casting to pure would break purity if the called function is
> not actually pure. AFAIU, the problem is that the mutable
> function pointers are not accessible from inside the pure
> function at all, in which case the solution is to cast them to
> immutable, not to pure.


Indeed that is the problem. I didn't think of casting to 
immutable, that should work..




> But to cast something, you'd need to have access to it in the
> first place...
>
> This seems to work:
>
> int function(int) pure my_func_ptr;
>
> struct CallImmutable {
>     static opDispatch(string fn, Args...)(Args args) {
>         return mixin(fn)(args);
>     }
> }
>
> int test() pure {
>     return CallImmutable.my_func_ptr(1);
> }
>
> But I suspect it's because of a bug. `CallImmutable.opDispatch`
> should not be deduced to be pure, and thus not be callable from
> `test`.


Yeah that looks like a bug. I should be able to conjure something 
up, perhaps using assumeUnique, that won't break in newer 
versions.


Thanks for the answers!


Re: fork/waitpid and std.concurrency.spawn

2014-07-22 Thread Puming via Digitalmars-d-learn
I've only found spawnProcess/spawnShell and the like, which
execute a new command but not a function pointer, the way fork()
and std.concurrency.spawn do.

What is the function that does what I describe?

On Tuesday, 22 July 2014 at 10:43:58 UTC, FreeSlave wrote:

> On Tuesday, 22 July 2014 at 07:58:50 UTC, Puming wrote:
>> Is there a fork()/wait() API similar to std.concurrency
>> spawn()?
>>
>> The best thing I've got so far is
>> core.sys.posix.unistd.fork(), but it seems to only work on
>> POSIX. Is there a unified API for process-level concurrency,
>> ideally with actor and message-passing support too?
>
> You need std.process.




Re: How to know whether a file's encoding is ansi or utf8?

2014-07-22 Thread Sam Hu via Digitalmars-d-learn

On Tuesday, 22 July 2014 at 11:59:34 UTC, Alexandre wrote:

> Read the BOM?
>
> module main;
>
> import std.stdio;
>
> enum Encoding
> {
>     UTF7,
>     UTF8,
>     UTF32,
>     Unicode,
>     BigEndianUnicode,
>     ASCII
> }
>
> Encoding GetFileEncoding(string fileName)
> {
>     import std.file;
>     auto bom = cast(ubyte[]) read(fileName, 4);
>
>     if (bom.length >= 3 && bom[0] == 0x2b && bom[1] == 0x2f && bom[2] == 0x76)
>         return Encoding.UTF7;
>     if (bom.length >= 3 && bom[0] == 0xef && bom[1] == 0xbb && bom[2] == 0xbf)
>         return Encoding.UTF8;
>     if (bom.length >= 2 && bom[0] == 0xff && bom[1] == 0xfe)
>         return Encoding.Unicode; // UTF-16LE
>     if (bom.length >= 2 && bom[0] == 0xfe && bom[1] == 0xff)
>         return Encoding.BigEndianUnicode; // UTF-16BE
>     if (bom.length >= 4 && bom[0] == 0 && bom[1] == 0 && bom[2] == 0xfe && bom[3] == 0xff)
>         return Encoding.UTF32; // UTF-32BE
>
>     return Encoding.ASCII;
> }
>
> void main(string[] args)
> {
>     if (GetFileEncoding("test.txt") == Encoding.UTF8)
>         writeln("The file is UTF8");
>     else
>         writeln("File is not UTF8 :(");
> }
>
> On Tuesday, 22 July 2014 at 09:50:00 UTC, Sam Hu wrote:
>> Greetings!
>>
>> As the subject says, how can I tell whether a file is encoded in UTF-8
>> or ANSI?
>>
>> Thanks for the help in advance.
>>
>> Regards,
>> Sam


Thanks. This is exactly what I want at this moment.


Re: How to know whether a file's encoding is ansi or utf8?

2014-07-22 Thread FreeSlave via Digitalmars-d-learn
Note that BOMs are optional and may not be present in a Unicode
file. Also, the presence of leading bytes that look like a BOM does not
necessarily mean that the file is encoded in some kind of Unicode.
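
Since a BOM may be missing, one complementary heuristic is to check whether the bytes are well-formed UTF-8 at all. The following is only a sketch under that assumption: `looksLikeUtf8` is a hypothetical helper name, not something from this thread or from Phobos, and a positive result does not prove the file was meant as UTF-8 (pure ASCII passes too, and some ANSI text can pass by coincidence).

```d
import std.utf : UTFException, validate;

/// Hypothetical helper: true if `bytes` are well-formed UTF-8.
/// Pure ASCII input also passes, since ASCII is a subset of UTF-8.
bool looksLikeUtf8(const(ubyte)[] bytes)
{
    try
    {
        // std.utf.validate throws UTFException on malformed input.
        validate(cast(const(char)[]) bytes);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}
```

The bytes returned by std.file.read can be cast to `const(ubyte)[]` and passed in. A file that fails this check is certainly not valid UTF-8, which is usually the more useful direction of the test.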


Re: Map one tuple to another Tuple of different type

2014-07-22 Thread Vlad Levenfeld via Digitalmars-d-learn
I'm just confused about how static while is supposed to work,
because static foreach, to my understanding, would have to work
by making a new type for each iteration. I say this because 1)
runtime foreach works like that (with type = range), and 2)
without CTFE foreach, the only way I know of to iterate a
typelist is to make a new type with one less element, so I
imagine static foreach lowers to that.


I suppose it's possible to make a struct with static immutable
start and end iterators, and make new types out of advancing the
start iterator until it is equal to the end. Seems like a step
backward though.


Anyway my actual question is: if all values are constant at 
compile time, how would a static while loop terminate?


Re: How to know whether a file's encoding is ansi or utf8?

2014-07-22 Thread Alexandre via Digitalmars-d-learn

http://www.architectshack.com/TextFileEncodingDetector.ashx

On Tuesday, 22 July 2014 at 15:53:23 UTC, FreeSlave wrote:
> Note that BOMs are optional and may not be present in a Unicode
> file. Also, the presence of leading bytes that look like a BOM does not
> necessarily mean that the file is encoded in some kind of Unicode.



There are several difficulties in this case ...


Re: fork/waitpid and std.concurrency.spawn

2014-07-22 Thread FreeSlave via Digitalmars-d-learn

On Tuesday, 22 July 2014 at 14:26:05 UTC, Puming wrote:
> I've only found spawnProcess/spawnShell and the like, which
> execute a new command but not a function pointer, the way fork()
> and std.concurrency.spawn do.
>
> What is the function that does what I describe?
>
> On Tuesday, 22 July 2014 at 10:43:58 UTC, FreeSlave wrote:
>> On Tuesday, 22 July 2014 at 07:58:50 UTC, Puming wrote:
>>> Is there a fork()/wait() API similar to std.concurrency
>>> spawn()?
>>>
>>> The best thing I've got so far is
>>> core.sys.posix.unistd.fork(), but it seems to only work on
>>> POSIX. Is there a unified API for process-level concurrency,
>>> ideally with actor and message-passing support too?
>>
>> You need std.process.


I'm not sure what you're trying to do. POSIX fork does not just
run a function: it spawns a new process as a copy of its parent, and
both continue execution from the point where fork returns.
Windows creates processes in a different way, and there seems to be
no function in WinAPI with the same functionality as POSIX fork
(though you can look for implementations on the Internet, use Cygwin,
or try the Microsoft POSIX subsystem).
I think the reason Phobos does not have the functionality you
want is that the standard library should be platform-agnostic. So
instead of emulating things that some platform does not support,
it simply leaves them out.
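
For the POSIX-only case, the fork/waitpid pattern discussed here can be wrapped by hand. This is a minimal sketch, not a Phobos API: `runInChild` is a hypothetical name, the bindings come from druntime's core.sys.posix modules, and the whole thing compiles to nothing on non-POSIX platforms.

```d
version (Posix)
{
    import core.sys.posix.sys.types : pid_t;
    import core.sys.posix.sys.wait : waitpid, WEXITSTATUS, WIFEXITED;
    import core.sys.posix.unistd : _exit, fork;

    /// Hypothetical helper: runs `work` in a forked child process and
    /// returns the child's exit status, or -1 if anything went wrong.
    int runInChild(int function() work)
    {
        immutable pid_t pid = fork();
        if (pid < 0)
            return -1; // fork failed

        if (pid == 0)
        {
            // Child: run the function, then exit immediately so the
            // child never continues into the parent's code path.
            int code = 1;
            try
            {
                code = work();
            }
            catch (Throwable)
            {
            }
            _exit(code);
        }

        // Parent: block until the child terminates.
        int status;
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    }
}
```

Only the low 8 bits of the return value survive as an exit status, and nothing here provides message passing; anything richer would need pipes or sockets on top.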


Re: Map one tuple to another Tuple of different type

2014-07-22 Thread H. S. Teoh via Digitalmars-d-learn
On Tue, Jul 22, 2014 at 03:52:14PM +0000, Vlad Levenfeld via
Digitalmars-d-learn wrote:
> I'm just confused about how static while is supposed to work, because
> static foreach, to my understanding, would have to work by making a
> new type for each iteration. I say this because 1) runtime foreach
> works like that (with type = range), and 2) without CTFE foreach, the
> only way I know of to iterate a typelist is to make a new type with
> one less element, so I imagine static foreach lowers to that.
>
> I suppose it's possible to make a struct with static immutable start
> and end iterators, and make new types out of advancing the start
> iterator until it is equal to the end. Seems like a step backward
> though.
>
> Anyway my actual question is: if all values are constant at compile
> time, how would a static while loop terminate?

Basically, think of it as custom loop unrolling:

// std.typecons.Tuple with named fields
Tuple!(
    int, "x",
    float, "y",
    uint, "z"
) t;

// This loop:
foreach (i; staticIota!(0, 3)) {
    t[i]++;
}

// Is equivalent to:
t[0]++;
t[1]++;
t[2]++;

// Which is equivalent to:
t.x++;
t.y++;
t.z++;

The loop body is basically expanded for each iteration, with the loop
variable suitably substituted with each element of the typelist.
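
As a concrete, compilable version of the unrolling described above, here is a sketch that assumes a compiler recent enough to support `static foreach` (DMD 2.076 or later, so newer than this thread). The names `Triple` and `incrementAll` are made up for illustration.

```d
import std.typecons : Tuple;

// Hypothetical alias for the tuple type used in the example above.
alias Triple = Tuple!(int, "x", float, "y", uint, "z");

/// Increments every field; the loop body is expanded once per index
/// at compile time, so `t[i]` is typed differently in each expansion.
Triple incrementAll(Triple t)
{
    static foreach (i; 0 .. Triple.Types.length)
    {
        t[i]++;
    }
    return t;
}
```

Because `i` is a compile-time constant in each expansion, `t[i]` resolves to the int, float and uint fields in turn, which an ordinary runtime loop index could not do.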


T

-- 
Microsoft is to operating systems & security ... what McDonalds is to gourmet
cooking.


Need help with basic functional programming

2014-07-22 Thread Eric via Digitalmars-d-learn


I have been writing several lexers and parsers. The grammars I need to
parse are really complex, and consequently I didn't feel confident about
the code quality, especially in the lexers. So I decided to jump on the
functional programming bandwagon to see if that would help. It definitely
does help: there are fewer lines of code, and I feel better about the code
quality. I started at the high level, and had the input buffer return a
range of characters, and the lexer return a range of tokens. But when I got
down to the lower levels of building up tokens, I ran into a problem.


First I started with this, which worked:

private void getNumber(MCInputStreamRange buf)
{
    while (!buf.empty())
    {
        p++;
        buf.popFront();
        if (buf.front() <= '0' || buf.front() >= '9') break;
        *p = buf.front();
    }
    curTok.kind = Token_t.NUMBER;
    curTok.image = cast(string) cbuffer[0 .. (p - cbuffer.ptr)].dup;
}

I thought I could improve this like so:

private void getNumber(MCInputStreamRange buf)
{
    auto s = buf.until("a <= '0' || a >= '9'");
    curTok.kind = Token_t.NUMBER;
    curTok.image = to!string(s);
}

The problem is that until seems to not stop at the end of the number,
and instead continues until the end of the buffer. Am I doing something
wrong here? Also, what is the fastest way to convert a range to a string?


Thanks,

Eric


Re: Need help with basic functional programming

2014-07-22 Thread bearophile via Digitalmars-d-learn

Eric:


> while (!buf.empty())
> {
>     p++;
>     buf.popFront();

Those () can be omitted, if you mind the noise (but you can also
keep them).


> if (buf.front() <= '0' || buf.front() >= '9') break;

std.ascii.isDigit helps.


> curTok.image = cast(string) cbuffer[0 .. (p - cbuffer.ptr)].dup;

If you want a string, then idup is better. Try to minimize the
number of casts in your code.


> auto s = buf.until("a <= '0' || a >= '9'");

Perhaps you need a ! after the until, or a !q{a <= '0' || a >= '9'}.


> Also, what is the fastest way to convert a range to a string?

The text function is the simplest.

Bye,
bearophile


Re: Need help with basic functional programming

2014-07-22 Thread anonymous via Digitalmars-d-learn

On Tuesday, 22 July 2014 at 16:50:47 UTC, Eric wrote:

> private void getNumber(MCInputStreamRange buf)
> {
>     auto s = buf.until("a <= '0' || a >= '9'");
>     curTok.kind = Token_t.NUMBER;
>     curTok.image = to!string(s);
> }
>
> The problem is that until seems to not stop at the end of the number,
> and instead continues until the end of the buffer. Am I doing something
> wrong here?


You've forgotten the exclamation mark: buf.until!(...)
Without it, the string is not the predicate, but the sentinel
value. I.e. the range stops when it sees the characters "a <= '0'
|| a >= '9'".

By the way, do you really mean to stop on '0' and '9'? Do you
perhaps mean a < '0' || a > '9'?


> Also, what is the fastest way to convert a range to a string?


The fastest to type is probably text(r) (or r.text). The fastest
for me to come up with is r.to!string, which does exactly the
same. I don't know about run time, but text/to!string is
hopefully fine.
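
To make the predicate form concrete, here is a small sketch that combines the suggestions in this thread: `until!` with a predicate, std.ascii.isDigit, and `text` for the conversion. It works on a plain `string` rather than Eric's `MCInputStreamRange` (which is not shown in the thread), and `leadingDigits` is a hypothetical name.

```d
import std.algorithm.searching : until;
import std.ascii : isDigit;
import std.conv : text;

/// Hypothetical helper: returns the run of ASCII digits at the start
/// of `input`, stopping before the first non-digit character.
string leadingDigits(string input)
{
    // The `!` makes the lambda a template argument, i.e. the predicate.
    // Without it, the argument would be treated as a sentinel value.
    return input.until!(c => !isDigit(c)).text;
}
```

The range returned by `until!` is lazy; nothing is copied until `text` walks it and builds the string.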


Re: Need help with basic functional programming

2014-07-22 Thread Eric via Digitalmars-d-learn




> By the way, do you really mean to stop on '0' and '9'? Do you
> perhaps mean a < '0' || a > '9'?



Yes, my bad...


Re: Need help with basic functional programming

2014-07-22 Thread Eric via Digitalmars-d-learn

On Tuesday, 22 July 2014 at 17:09:29 UTC, bearophile wrote:

> Eric:
>
>> while (!buf.empty())
>> {
>>     p++;
>>     buf.popFront();
>
> Those () can be omitted, if you mind the noise (but you can
> also keep them).
>
>> if (buf.front() <= '0' || buf.front() >= '9') break;
>
> std.ascii.isDigit helps.
>
>> curTok.image = cast(string) cbuffer[0 .. (p - cbuffer.ptr)].dup;
>
> If you want a string, then idup is better. Try to minimize the
> number of casts in your code.
>
>> auto s = buf.until("a <= '0' || a >= '9'");
>
> Perhaps you need a ! after the until, or a !q{a <= '0' || a >= '9'}.
>
>> Also, what is the fastest way to convert a range to a string?
>
> The text function is the simplest.
>
> Bye,
> bearophile


Thanks!  All very good suggestions...

-Eric





Re: Need help with basic functional programming

2014-07-22 Thread via Digitalmars-d-learn

On Tuesday, 22 July 2014 at 17:09:29 UTC, bearophile wrote:

> Eric:
>
>> while (!buf.empty())
>> {
>>     p++;
>>     buf.popFront();
>
> Those () can be omitted, if you mind the noise (but you can
> also keep them).


Actually, the ones behind `empty` and `front` are wrong, because 
these are defined to be properties. They just happen to work 
currently.


Re: Map one tuple to another Tuple of different type

2014-07-22 Thread via Digitalmars-d-learn
On Tuesday, 22 July 2014 at 16:42:14 UTC, H. S. Teoh via 
Digitalmars-d-learn wrote:
> On Tue, Jul 22, 2014 at 03:52:14PM +0000, Vlad Levenfeld via
> Digitalmars-d-learn wrote:
>> Anyway my actual question is: if all values are constant at compile
>> time, how would a static while loop terminate?
>
> Basically, think of it as custom loop unrolling:
>
> // std.typecons.Tuple with named fields
> Tuple!(
>     int, "x",
>     float, "y",
>     uint, "z"
> ) t;
>
> // This loop:
> foreach (i; staticIota!(0, 3)) {
>     t[i]++;
> }
>
> // Is equivalent to:
> t[0]++;
> t[1]++;
> t[2]++;
>
> // Which is equivalent to:
> t.x++;
> t.y++;
> t.z++;
>
> The loop body is basically expanded for each iteration, with the loop
> variable suitably substituted with each element of the typelist.


You're misunderstanding him. Your example is a static foreach, 
but Vlad asked about static while. I too don't see how a static 
while is supposed to work.


Re: Map one tuple to another Tuple of different type

2014-07-22 Thread Vlad Levenfeld via Digitalmars-d-learn

Yes, though the loop unrolling is news to me. I'll have to keep
that in mind next time I'm trying to squeeze some extra
performance out of a loop.

btw, found a static switch enhancement request here:
https://issues.dlang.org/show_bug.cgi?id=6921