[Issue 9045] Feature request for std.asscii => function isNewline

2022-12-17 Thread d-bugmail--- via Digitalmars-d-bugs
https://issues.dlang.org/show_bug.cgi?id=9045

Iain Buclaw  changed:

   What|Removed |Added

   Priority|P2  |P4

--


[Issue 9045] Feature request for std.asscii => function isNewline

2019-11-20 Thread d-bugmail--- via Digitalmars-d-bugs
https://issues.dlang.org/show_bug.cgi?id=9045

Vladimir Panteleev  changed:

   What|Removed |Added

   Priority|P5  |P2
 Status|REOPENED|NEW
  Component|druntime|phobos

--- Comment #16 from Vladimir Panteleev  ---
Undoing vandalism(?) by Parmigiano

--


[Issue 9045] Feature request for std.asscii => function isNewline

2019-11-20 Thread d-bugmail--- via Digitalmars-d-bugs
https://issues.dlang.org/show_bug.cgi?id=9045

Vladimir Panteleev  changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 CC||dlang-bugzilla@thecybershad
   ||ow.net
 Resolution|MOVED   |---

--


[Issue 9045] Feature request for std.asscii => function isNewline

2019-11-19 Thread d-bugmail--- via Digitalmars-d-bugs
https://issues.dlang.org/show_bug.cgi?id=9045

--- Comment #15 from berni44  ---
Where has this been moved to?

--


[Issue 9045] Feature request for std.asscii => function isNewline

2019-11-19 Thread d-bugmail--- via Digitalmars-d-bugs
https://issues.dlang.org/show_bug.cgi?id=9045

Parmigiano2  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |MOVED

--


[Issue 9045] Feature request for std.asscii => function isNewline

2019-11-19 Thread d-bugmail--- via Digitalmars-d-bugs
https://issues.dlang.org/show_bug.cgi?id=9045

Parmigiano2  changed:

   What|Removed |Added

   Priority|P2  |P5
 Status|NEW |ASSIGNED
 CC||i6o34a+7y7j1p606trkc@sharkl
   ||asers.com
  Component|phobos  |druntime

--


[Issue 9045] Feature request for std.asscii => function isNewline

2019-11-19 Thread d-bugmail--- via Digitalmars-d-bugs
https://issues.dlang.org/show_bug.cgi?id=9045

berni44  changed:

   What|Removed |Added

 CC||bugzi...@d-ecke.de
   Severity|normal  |enhancement

--


[Issue 9045] Feature request for std.asscii => function isNewline

2016-03-22 Thread via Digitalmars-d-bugs
https://issues.dlang.org/show_bug.cgi?id=9045

--- Comment #13 from Dmitry Olshansky  ---
(In reply to Nick Sabalausky from comment #10)
> While an 'isNewline(dchar)' func wouldn't work for reasons already
> discussed, what *would* work and be very helpful IMO, is something similar
> to the 'std.conv.parse(...)' functions. Ie, just like this, but properly
> templated, range-ified and UTF8/16-ified:

A skipNewLine that takes a string by ref and returns a bool indicating whether
there was a newline would indeed be useful.
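
A minimal sketch of what such a function might look like, assuming UTF-8 input and the newline set from Unicode TR18. The name skipNewLine comes from the comment above, but the signature and body here are illustrative assumptions, not an existing Phobos API:

```d
/// Hypothetical helper (not part of Phobos): if `s` starts with a newline
/// sequence, remove it from `s` and return true; otherwise leave `s` as is.
bool skipNewLine(ref string s)
{
    // "\r\n" is listed before "\r" so that CR+LF is consumed as one newline.
    // U+0085, U+2028 and U+2029 are multi-byte in UTF-8, hence string needles.
    static immutable string[] seqs =
        ["\r\n", "\n", "\r", "\v", "\f", "\u0085", "\u2028", "\u2029"];

    foreach (seq; seqs)
    {
        if (s.length >= seq.length && s[0 .. seq.length] == seq)
        {
            s = s[seq.length .. $];
            return true;
        }
    }
    return false;
}

unittest
{
    string text = "\r\nnext line";
    assert(skipNewLine(text));   // CR+LF is skipped in one call
    assert(text == "next line");
    assert(!skipNewLine(text));  // no leading newline: input is unchanged
    assert(text == "next line");
}
```

Comparing code units directly avoids decoding; a range-based version would need a different approach.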

--


[Issue 9045] Feature request for std.asscii => function isNewline

2013-03-28 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=9045



--- Comment #11 from Nick Sabalausky cbkbbej...@mailinator.com 2013-03-28 
17:35:23 PDT ---
*** Issue 8880 has been marked as a duplicate of this issue. ***

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
--- You are receiving this mail because: ---


[Issue 9045] Feature request for std.asscii => function isNewline

2013-03-26 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=9045


Nick Sabalausky cbkbbej...@mailinator.com changed:

   What|Removed |Added

 CC||cbkbbej...@mailinator.com


--- Comment #10 from Nick Sabalausky cbkbbej...@mailinator.com 2013-03-26 
10:04:56 PDT ---
While an 'isNewline(dchar)' func wouldn't work for reasons already discussed,
what *would* work and be very helpful IMO, is something similar to the
'std.conv.parse(...)' functions. Ie, just like this, but properly templated,
range-ified and UTF8/16-ified:

import std.range : empty;
import std.uni : lineSep, paraSep;

/// If 'str' starts with a Unix, Windows, Unicode or Mac9 newline, it is
/// removed from 'str' and returned. Otherwise, null is returned.
dstring parseNewline(ref dstring str)
{
    if(str.empty)
        return null;

    dstring ret;

    // Newlines are as defined in:
    // http://www.unicode.org/reports/tr18/#Line_Boundaries
    switch(str[0])
    {
    case '\r':
        // "\r\n" is consumed as a single newline.
        if(str.length > 1 && str[1] == '\n')
        {
            ret = str[0..2];
            str = str[2..$];
            break;
        }
        goto case; // a lone '\r' is handled like the cases below

    case '\n':
    case '\f':
    case '\v':
    case '\x85':
    case paraSep:
    case lineSep:
        ret = str[0..1];
        str = str[1..$];
        break;

    default:
        ret = null;
        break;
    }

    return ret;
}

-- 


[Issue 9045] Feature request for std.asscii => function isNewline

2012-11-20 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=9045


monarchdo...@gmail.com changed:

   What|Removed |Added

 CC||monarchdo...@gmail.com


--- Comment #7 from monarchdo...@gmail.com 2012-11-20 05:42:25 PST ---
(In reply to comment #1)
> See representation on various systems:
> 
> http://en.wikipedia.org/wiki/Newline
> 
> In particular:
> On Unix, and Mac OS X: LF (1 char)
> On Windows: CR+LF (2 chars)

(In reply to comment #5)
> Technically speaking, if you don't know which type of line endings a file uses
> 
> [SNIP]

Isn't the line ending of a *file* totally irrelevant here? In the sense that it
is nothing more than the system's *storage* format?

On my Windows machine, the *strings* I manipulate don't have "\r\n" as a
newline, they have '\n'. That's the entire reason there is an "rb" and an "r"
option when reading a file.

If you *do* have an "\r\n" in your stream, then either:
* You have an actual '\r' in your stream, which is then followed by a new
line.
* You are actually erroneously manipulating a binary payload, which should be
of type ubyte[], and should not be using the std.ascii functions with it.

Under these circumstances, and following the Unicode definition, I'd say:

return 0x0A <= c && c <= 0x0D;

is not only correct (for ASCII), but any attempt to parse more than 1 character
for this info would be incorrect...

PS: WTF is \u{D A}

-- 


[Issue 9045] Feature request for std.asscii => function isNewline

2012-11-20 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=9045



--- Comment #8 from Dmitry Olshansky dmitry.o...@gmail.com 2012-11-20 
12:13:48 PST ---
(In reply to comment #7)
> (In reply to comment #1)
> > See representation on various systems:
> > 
> > http://en.wikipedia.org/wiki/Newline
> > 
> > In particular:
> > On Unix, and Mac OS X: LF (1 char)
> > On Windows: CR+LF (2 chars)
> 
> (In reply to comment #5)
> > Technically speaking, if you don't know which type of line endings a file
> > uses
> > 
> > [SNIP]
> 
> Isn't the line ending of a *file* totally irrelevant here? In the sense that it
> is nothing more than the system's *storage* format?

There is no system encoding. It died and was buried in the same tomb as FTP
ASCII mode a long time ago. After all, files are transferred in many different
ways, and expecting someone to transcode line endings everywhere is plainly
impossible (you don't always know the target system). So at the end of the day,
reasonable programs just deal with the whole zoo of them.

> On my Windows machine, the *strings* I manipulate don't have "\r\n" as a
> newline, they have '\n'. That's the entire reason there is an "rb" and an "r"
> option when reading a file.

And I'd say rb option is a woefully broken thing. Putting \n does in fact
store \r\n in this mode. You are far safer with binary mode; at least it's
WYSIWYG.

> If you *do* have an "\r\n" in your stream, then either:
> * You have an actual '\r' in your stream, which is then followed by a new
> line.

> Under these circumstances, and following the Unicode definition, I'd say:
> 
> return 0x0A <= c && c <= 0x0D;
> 
> is not only correct (for ASCII), but any attempt to parse more than 1
> character
> for this info would be incorrect...

No, no and no. It's a fact of life (or rather of the standard) that \r\n is a
single entity. And it can't be parsed other than by looking at two characters
(or rather code points).

> 
> PS: WTF is \u{D A}

It's 2 dchars : \r\n. That is 0x0D and 0x0A. They are being cute and use
flexible width syntax not the old ones: \u and \U.

-- 


[Issue 9045] Feature request for std.asscii => function isNewline

2012-11-20 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=9045



--- Comment #9 from Jonathan M Davis jmdavisp...@gmx.com 2012-11-20 12:29:09 
PST ---
Any code that only cares about \n or \r\n isn't going to work with a function
which returns true for both. And any code that doesn't care is going to be
ill-served by such a function, because the reality is that you need to watch
for \r and \n as individual characters and properly handle the cases when
they're separate as well as when they're together. And the fact that one of them
is multiple characters generally screws with trying to check for them with a
single function in the middle of iterating. Rather, you have to watch for \r
and \n individually and then figure out whether you're dealing with them singly
or together and act appropriately. A function which returns true for a string
is generally useless for the kind of situations where you'd be checking for
newlines, so I'm highly skeptical that such a function is of any real value.
Instead, you end up with stuff like

for(auto range = str.save; !range.empty; range.popFront())
{
    switch(range.front)
    {
        case '\r':
            auto temp = range.save;
            temp.popFront();
            // Check for empty so a trailing '\r' does not read past the end.
            if(!temp.empty && temp.front == '\n')
            {
                range.popFront();
                goto case '\n';
            }
            goto default;
        case '\n':
            //do whatever you do for end of line
            break;
        default:
            //do whatever you do for individual characters
            break;
    }
}

And if all you want to know is whether a particular string starts or ends
with a newline, then it's easy enough to just do str.startsWith("\n", "\r\n")
or str.endsWith("\n", "\r\n"). That gets uglier when you need to deal with
Unicode rather than just \n and \r\n, but then I believe all that you really
need is to add [paraSep] and [lineSep] to the list.

I'm not sure that a function telling you whether a string designates the end of
a line is completely useless, but in pretty much every case that I can see code
caring, such a function wouldn't work very well.
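
The startsWith approach described above can be sketched as follows. The helper name leadingNewline is hypothetical; the sketch relies on std.algorithm.searching.startsWith accepting several needles (strings or single code points) and returning 0 for no match or the 1-based position of the needle that matched:

```d
import std.algorithm.searching : startsWith;
import std.uni : lineSep, paraSep;

/// Hypothetical helper: 0 if `s` does not begin with a newline, otherwise
/// the 1-based index of the matching needle in the list below.
uint leadingNewline(string s)
{
    return s.startsWith("\r\n", "\n", lineSep, paraSep);
}

unittest
{
    assert(leadingNewline("\r\nabc") == 1);   // Windows
    assert(leadingNewline("\nabc") == 2);     // Unix
    assert(leadingNewline("\u2028abc") == 3); // Unicode line separator
    assert(leadingNewline("abc") == 0);       // no leading newline
}
```

The returned index tells the caller which sequence matched, and therefore how many characters to skip.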

-- 


[Issue 9045] Feature request for std.asscii => function isNewline

2012-11-19 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=9045


Dmitry Olshansky dmitry.o...@gmail.com changed:

   What|Removed |Added

 CC||dmitry.o...@gmail.com


--- Comment #6 from Dmitry Olshansky dmitry.o...@gmail.com 2012-11-19 
10:49:39 PST ---
Somewhat related.

Unicode new line sequence is defined as:
\u{A} | \u{B} | \u{C} | \u{D} | \u{85} | \u{2028} | \u{2029} | \u{D A}

Note that sequence '\r\n' counts as one line end.

See the note here for example:
http://www.unicode.org/reports/tr18/#Line_Boundaries
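
As a rough illustration, the single-code-point alternatives in that list could be tested like this. The function name isNewlineCodePoint is an assumption made for this sketch, and the function deliberately does not cover \u{D A}, which needs one code point of lookahead:

```d
/// Hypothetical predicate: true for each single code point in the list
/// above. It cannot tell that "\r\n" forms one line end; the caller has
/// to look at the next code point for that case.
bool isNewlineCodePoint(dchar c) @safe pure nothrow @nogc
{
    return (c >= 0x0A && c <= 0x0D)  // LF, VT, FF, CR
        || c == 0x85                 // NEL
        || c == 0x2028               // line separator
        || c == 0x2029;              // paragraph separator
}

unittest
{
    assert(isNewlineCodePoint('\n'));
    assert(isNewlineCodePoint('\u2029'));
    assert(!isNewlineCodePoint(' '));
}
```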

-- 


[Issue 9045] Feature request for std.asscii => function isNewline

2012-11-18 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=9045


bearophile_h...@eml.cc changed:

   What|Removed |Added

 CC||bearophile_h...@eml.cc


--- Comment #1 from bearophile_h...@eml.cc 2012-11-18 16:59:06 PST ---
See representation on various systems:

http://en.wikipedia.org/wiki/Newline

In particular:
On Unix, and Mac OS X: LF (1 char)
On Windows: CR+LF (2 chars)

-- 


[Issue 9045] Feature request for std.asscii => function isNewline

2012-11-18 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=9045



--- Comment #2 from bioinfornatics bioinfornat...@gmail.com 2012-11-18 
17:08:08 PST ---
(In reply to comment #1)
> See representation on various systems:
> 
> http://en.wikipedia.org/wiki/Newline
> 
> In particular:
> On Unix, and Mac OS X: LF (1 char)
> On Windows: CR+LF (2 chars)

yes not easy or into string

bool isNewline(in dchar[] c) @safe pure nothrow {
    bool result = false;
    if( c[0] == 0x0A)
        result = true;
    else if( c.length >= 2 && c[0] == 0x0D && c[1] == 0x0A)
        result = true;
    ...
    ...
    return result;
}

-- 


[Issue 9045] Feature request for std.asscii => function isNewline

2012-11-18 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=9045


Jonathan M Davis jmdavisp...@gmx.com changed:

   What|Removed |Added

 CC||jmdavisp...@gmx.com


--- Comment #3 from Jonathan M Davis jmdavisp...@gmx.com 2012-11-18 17:15:06 
PST ---
Just check whether it equals std.ascii.newline.

-- 


[Issue 9045] Feature request for std.asscii => function isNewline

2012-11-18 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=9045



--- Comment #4 from bioinfornatics bioinfornat...@gmail.com 2012-11-18 
23:19:27 PST ---
(In reply to comment #3)
> Just check whether it equals std.ascii.newline.

Yes, but the problem arises when you need to write a parser and detect the end
of a line. The file may come from any operating system, so you can't use
std.ascii.newline, as it is system-specific.
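
For the parsing case described here, one possible approach (a sketch, not something proposed in this thread) is to let std.string.splitLines do the detection. It treats "\r\n", '\n', '\r' and the Unicode line terminators as delimiters regardless of the host system:

```d
import std.string : splitLines;

unittest
{
    // Unix, Windows and classic Mac line endings mixed in one buffer.
    string input = "unix\nwindows\r\nold mac\rlast";
    assert(input.splitLines == ["unix", "windows", "old mac", "last"]);
}
```

This only helps when whole lines are wanted; a parser that inspects input character by character still needs its own lookahead for "\r\n".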

-- 


[Issue 9045] Feature request for std.asscii => function isNewline

2012-11-18 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=9045



--- Comment #5 from Jonathan M Davis jmdavisp...@gmx.com 2012-11-18 23:28:15 
PST ---
Technically speaking, if you don't know which type of line endings a file uses,
you can't possibly correctly determine when you've reached the end of a line.
Best case, you have to assume that '\r\n' and '\n' both designate the end of
the line, whereas it's perfectly legal to have a '\r' be the last character on
a line (i.e. the one before the characters indicating the end of the line) in
Linux, and it's perfectly valid to have '\n' be in the middle of a line on
Windows. So, parsing with the assumption that both '\r\n' and '\n' indicate the
end of the line is actually incorrect no matter what OS you're on.

That doesn't mean that it's entirely unreasonable to have such a function,
but it does mean that it can't possibly be 100% correct.

On an unrelated note, I'd point out that

return ( c == 0x0A || c == 0x0D )? true : false;

is redundant. The ternary operator is completely unnecessary. It would be
better if it were

return c == 0x0A || c == 0x0D;

-- 