Re: Reading files using delimiters/terminators

2020-12-30 Thread Rekel via Digitalmars-d-learn
On Tuesday, 29 December 2020 at 14:50:41 UTC, Steven 
Schveighoffer wrote:
Are you on Windows? If so, your double newlines might be 
\r\n\r\n, depending on what editor you used to create the 
input. Use a hexdump program to see what the newlines are in 
your input file.


I've tried \r\n\r\n as well, which sadly also did not work.
Using vscode I have also switched between CRLF and LF, which also 
did not do the trick.
I'm getting the sense the implementation might have a specific 
workaround for \r\n / CRLF line endings, though I haven't checked 
the source code yet.


Note that this is not really a problem for me specifically; I've 
long used a different approach. It did seem like a design issue, 
though. I'll try replicating this in isolation later; maybe 
something was wrong last time I tried.


Re: Reading files using delimiters/terminators

2020-12-29 Thread Steven Schveighoffer via Digitalmars-d-learn

On 12/26/20 7:13 PM, Rekel wrote:
I'm trying to read a file with entries separated by '\n\n' (empty line), 
with entries containing '\n'. I thought the 
File.readLine(KeepTerminator, Terminator) might work, as it seems to 
accept strings as terminators, since there seems to have been a thread 
regarding '\r\n' separators.


I don't know if there's some underlying reason, but when I try to use 
"\n\n" as a terminator, I end up getting the entire file into 1 char[], 
so it's not delimited.


Should this work or is there a reason one cannot use byLine like this?

For context, I'm trying this with the puzzle input of day 6 of this 
year's advent of code. (https://adventofcode.com/)


Are you on Windows? If so, your double newlines might be \r\n\r\n, 
depending on what editor you used to create the input. Use a hexdump 
program to see what the newlines are in your input file.


Now, you would think that the underlying C stream would do this for you. 
I'm not sure how it works exactly, as I don't use Windows.
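
A minimal sketch of how one might check for and work around CRLF line endings in memory, assuming the whole input has already been read into a string (for example with std.file.readText). The helper names hasCRLF and blankLineGroups are hypothetical and only serve the illustration:

```d
import std.algorithm : canFind, splitter;
import std.array : array, replace;

// Hypothetical helper: true if the text contains Windows-style line endings.
bool hasCRLF(string text) {
    return text.canFind("\r\n");
}

// Hypothetical helper: normalize CRLF to LF, then split on empty lines.
string[] blankLineGroups(string text) {
    return text.replace("\r\n", "\n").splitter("\n\n").array;
}

void main() {
    auto windowsText = "abc\r\n\r\na\r\nb";
    assert(hasCRLF(windowsText));
    // After normalization, both variants yield the same groups.
    assert(blankLineGroups(windowsText) == ["abc", "a\nb"]);
    assert(blankLineGroups("abc\n\na\nb") == ["abc", "a\nb"]);
}
```

Normalizing first means the later splitting code only has to deal with one kind of newline, whatever the editor wrote.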


-Steve


Re: Reading files using delimiters/terminators

2020-12-28 Thread Rekel via Digitalmars-d-learn

http://ddili.org/ders/d.en/index.html


This seems very promising :)
I doubt I'd still be considering D if it weren't for this awesome 
learning forum, thanks for all the help!


Re: Reading files using delimiters/terminators

2020-12-27 Thread Mike Parker via Digitalmars-d-learn

On Sunday, 27 December 2020 at 23:18:37 UTC, Rekel wrote:


Update;
Any clue why there's both "std.file" and "std.stdio.File"?
I was mostly unaware of the former.


The very first paragraph at the top of the `std.file` 
documentation explains it:


"Functions in this module handle files as a unit, e.g., read or 
write one file at a time. For opening files and manipulating them 
via handles refer to module std.stdio."


https://dlang.org/phobos/std_file.html
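
To make the distinction concrete, here is a small sketch under the assumption that a scratch file in the temp directory is acceptable. The file name and the helper countLines are made up for the example:

```d
import std.file : readText, remove, tempDir, write;
import std.path : buildPath;
import std.stdio : File;

// Hypothetical helper: std.stdio opens a handle and reads line by line.
size_t countLines(string path) {
    auto f = File(path, "r");
    size_t n;
    foreach (line; f.byLine)
        ++n;
    return n;
}

void main() {
    // Made-up scratch file name, used only for this demonstration.
    auto path = buildPath(tempDir, "std_file_vs_stdio_demo.txt");

    // std.file: the file is handled as a unit, written and read in one call.
    write(path, "one\ntwo\n");
    scope (exit) remove(path);
    assert(readText(path) == "one\ntwo\n");

    // std.stdio: the same file, processed through a handle.
    assert(countLines(path) == 2);
}
```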


Re: Reading files using delimiters/terminators

2020-12-27 Thread oddp via Digitalmars-d-learn

On 28.12.20 00:12, Rekel via Digitalmars-d-learn wrote:

is there a reason to use either 'splitter' or 'split'?


split gives you a newly allocated array with the results; splitter is the lazy equivalent and doesn't 
allocate. Feel free to use either; it doesn't matter much with these small puzzle inputs.
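
A short sketch of the difference, with two hypothetical helpers (allParts and firstParts) that exist only for this illustration:

```d
import std.algorithm : splitter;
import std.array : array, split;
import std.range : take;

// Hypothetical helper: eager, allocates an array holding every part.
string[] allParts(string s) {
    return s.split(",");
}

// Hypothetical helper: lazy, only walks the input as far as the first n parts.
// The final .array allocates, but only for those n parts.
string[] firstParts(string s, size_t n) {
    return s.splitter(",").take(n).array;
}

void main() {
    assert(allParts("a,b,c,d") == ["a", "b", "c", "d"]);
    assert(firstParts("a,b,c,d", 2) == ["a", "b"]);
}
```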


Sidetangent, don't mean to bash the learning tour, as it's been really useful for getting started, 
but I'm surprised stuff like tuples and files aren't mentioned there.
Especially since the documentation tends to trip me up, with stuff like 'isSomeString' mentioning 
'built in string types', while I haven't been able to find that concept elsewhere, let alone 
functionality one can expect in this case (like .length and the like), and stuff like 'countUntil' 
not being called 'indexOf', although it also exists and does basically the same thing. Also 
assumeUnique seems to be a thing?


Might be worth discussing that in a new topic. The stdlib is vast and has tons of useful utilities, 
not all of which can be explained in detail in a series of overview posts. Ali's "Programming in D" 
[1], which has a free online version, functions as an excellent in-depth introduction to the 
language, going over all the important topics.


Regarding function names and docs: Yes, some might seem slightly off coming from other languages 
(e.g. find vs. dropWhile, until vs. takeWhile, cumulativeFold vs. scan/accumulate, etc.), but it's 
all in there somewhere, implemented with great care not to waste precious cycles. That might make it 
harder to grok when going over the implementation or docs for the very first time, but it gets easier 
after a while. Furthermore, alternative names are often mentioned in the docs, so a quick 
Google search should bring you to the right place.


[1] http://ddili.org/ders/d.en/index.html


Re: Reading files using delimiters/terminators

2020-12-27 Thread Ali Çehreli via Digitalmars-d-learn

On 12/27/20 3:12 PM, Rekel wrote:

> is there a reason to use
> either 'splitter' or 'split'? I'm not sure I see why the difference
> would matter in the end.

splitter() is a lazy range algorithm. split() is a range algorithm as 
well but it is eager; it will put the results in an array that it grows. 
The string elements would not be copies of the original range; they will 
still be just the pair of .ptr and .length but it can be expensive if 
there are a lot of parts.


Further, if you want to process just a small number of the initial 
parts, then being eager would be wasteful.


Like all lazy range algorithms, splitter() is just an iteration object 
waiting to be used. It does not allocate any array but serves the parts 
one by one. You can filter the parts as you iterate over or you can stop 
at any point. For example, the following would take the first 3 
non-empty lines:


import std.stdio;
import std.range;
import std.algorithm;

void main() {
  auto s = "hello\n\nworld\n\n\nand\nmoon";
  // Lazily split on newlines, skip the empty parts, and take the first three.
  writefln!"%(%s, %)"(s.splitter('\n')
                       .filter!(part => !part.empty)
                       .take(3));
}

> Sidetangent, don't mean to bash the learning tour, as it's been really
> useful for getting started, but I'm surprised stuff like tuples and
> files aren't mentioned there.

Alternative place to search: :)

  http://ddili.org/ders/d.en/ix.html

> Especially since the documentation tends to trip me up, with stuff like
> 'isSomeString' mentioning 'built in string types', while I haven't been
> able to find that concept elsewhere,

Built-in strings are just arrays of character types: char[], wchar[], 
and dchar[]. They are commonly used through their respective immutable 
aliases: string, wstring, and dstring.
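
As a compile-time sketch of the above (nothing here runs; the static asserts are checked by the compiler):

```d
import std.traits : isSomeString;

// The aliases are arrays of immutable characters.
static assert(is(string == immutable(char)[]));
static assert(is(wstring == immutable(wchar)[]));
static assert(is(dstring == immutable(dchar)[]));

// isSomeString accepts the mutable arrays as well as the aliases.
static assert(isSomeString!string);
static assert(isSomeString!(char[]));
static assert(isSomeString!dstring);

// Arrays of non-character types are not strings.
static assert(!isSomeString!(int[]));

void main() {}
```

Because they are ordinary arrays, properties such as .length and slicing are available on them.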


> 'countUntil' not being called 'indexOf'

countUntil() is more general because it works with any range while 
indexOf requires a string.
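
A small sketch of the two side by side. The non-ASCII case relies on my understanding that indexOf reports a code-unit index while countUntil counts decoded elements, so treat that part as an assumption to verify against the docs:

```d
import std.algorithm.searching : countUntil;
import std.string : indexOf;

void main() {
    // For plain ASCII strings the two agree.
    assert("hello".indexOf('l') == 2);
    assert("hello".countUntil('l') == 2);

    // countUntil also works on ranges that are not strings.
    assert([10, 20, 30].countUntil(30) == 2);

    // Both return -1 when nothing is found.
    assert("hello".indexOf('z') == -1);
    assert("hello".countUntil('z') == -1);

    // Assumption: with non-ASCII text the results differ, because
    // "\u00E4" occupies two UTF-8 code units but is one element.
    assert("\u00E4b".indexOf('b') == 2);
    assert("\u00E4b".countUntil('b') == 1);
}
```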


> assumeUnique seems to be a thing?

That appears in the index I posted above as well. ;)

Ali



Re: Reading files using delimiters/terminators

2020-12-27 Thread Rekel via Digitalmars-d-learn

On Sunday, 27 December 2020 at 23:12:46 UTC, Rekel wrote:
Sidetangent, don't mean to bash the learning tour, as it's been 
really useful for getting started, but I'm surprised stuff like 
tuples and files aren't mentioned there.


Update;
Any clue why there's both "std.file" and "std.stdio.File"?
I was mostly unaware of the former.


Re: Reading files using delimiters/terminators

2020-12-27 Thread Rekel via Digitalmars-d-learn

On Sunday, 27 December 2020 at 13:27:49 UTC, oddp wrote:

foreach (group; readText("input").splitter("\n\n")) { ... }


Also, on other days, when the input is more uniform, there's 
always https://dlang.org/library/std/file/slurp.html which 
makes reading it in even easier, e.g. day02:


alias Record = Tuple!(int, "low", int, "high", char, "needle", string, "hay");
auto input = slurp!Record("input", "%d-%d %s: %s");

P.S.: would've loved to have had multiwayIntersection in the 
stdlib for day06 part2, especially when there's already 
multiwayUnion in setops. fold!setIntersection felt a bit clunky.


Oh my, all these things are new to me, haha, thanks a lot! I'll 
be looking into those (slurp & tuple). By the way, is there a 
reason to use either 'splitter' or 'split'? I'm not sure I see 
why the difference would matter in the end.


Sidetangent, don't mean to bash the learning tour, as it's been 
really useful for getting started, but I'm surprised stuff like 
tuples and files aren't mentioned there.
Especially since the documentation tends to trip me up, with 
stuff like 'isSomeString' mentioning 'built in string types', 
while I haven't been able to find that concept elsewhere, let 
alone functionality one can expect in this case (like .length and 
the like), and stuff like 'countUntil' not being called 
'indexOf', although it also exists and does basically the same 
thing. Also assumeUnique seems to be a thing?


Re: Reading files using delimiters/terminators

2020-12-27 Thread Jesse Phillips via Digitalmars-d-learn

On Sunday, 27 December 2020 at 13:21:44 UTC, Rekel wrote:
On Sunday, 27 December 2020 at 02:41:12 UTC, Jesse Phillips 
wrote:
Unfortunately std.csv is character based and not string. 
https://dlang.org/phobos/std_csv.html#.csvReader


But your use case sounds like splitter is more aligned with 
your needs.


https://dlang.org/phobos/std_algorithm_iteration.html#.splitter


But I'm not using csv, right? Additionally, shouldn't byLine also 
work with "\r\n"?


Right, you weren't using csv. I'm not familiar enough with the file 
terminator to know why it didn't work.


byLine would allow \r\n as well as \n


Re: Reading files using delimiters/terminators

2020-12-27 Thread oddp via Digitalmars-d-learn

On 27.12.20 01:13, Rekel via Digitalmars-d-learn wrote:
For context, I'm trying this with the puzzle input of day 6 of this year's advent of code. 
(https://adventofcode.com/)


For that specific puzzle I simply did:

foreach (group; readText("input").splitter("\n\n")) { ... }

Since the input is never that big, I prefer reading in the whole thing and then 
do the processing.

Also, on other days, when the input is more uniform, there's always 
https://dlang.org/library/std/file/slurp.html which makes reading it in even easier, e.g. day02:


alias Record = Tuple!(int, "low", int, "high", char, "needle", string, "hay");
auto input = slurp!Record("input", "%d-%d %s: %s");

P.S.: would've loved to have had multiwayIntersection in the stdlib for day06 part2, especially when 
there's already multiwayUnion in setops. fold!setIntersection felt a bit clunky.


Re: Reading files using delimiters/terminators

2020-12-27 Thread Rekel via Digitalmars-d-learn

On Sunday, 27 December 2020 at 02:41:12 UTC, Jesse Phillips wrote:
Unfortunately std.csv is character based and not string. 
https://dlang.org/phobos/std_csv.html#.csvReader


But your use case sounds like splitter is more aligned with 
your needs.


https://dlang.org/phobos/std_algorithm_iteration.html#.splitter


But I'm not using csv, right? Additionally, shouldn't byLine also 
work with "\r\n"?


Re: Reading files using delimiters/terminators

2020-12-26 Thread Ali Çehreli via Digitalmars-d-learn

On 12/26/20 4:13 PM, Rekel wrote:
I'm trying to read a file with entries separated by '\n\n' (empty line), 
with entries containing '\n'. I thought the 
File.readLine(KeepTerminator, Terminator) might work, as it seems to 
accept strings as terminators, since there seems to have been a thread 
regarding '\r\n' separators.


I don't know if there's some underlying reason, but when I try to use 
"\n\n" as a terminator, I end up getting the entire file into 1 char[], 
so it's not delimited.


Should this work or is there a reason one cannot use byLine like this?

For context, I'm trying this with the puzzle input of day 6 of this 
year's advent of code. (https://adventofcode.com/)


byLine should work:

import std.stdio;

void main() {
  auto f = File("deneme.d");

  // Warning: byLine reuses an internal buffer. Use byLineCopy
  // if the line (or slices of it) needs to persist past the iteration.
  foreach (line; f.byLine) {
    if (line.length == 0) {
      writeln("EMPTY LINE");
    } else {
      writeln(line);
    }
  }
}
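
Building on the empty-line check, here is a sketch of collecting the lines into groups. It works on an in-memory string via std.string.lineSplitter so that it is self-contained; with File.byLine each stored line would need a copy (line.idup) or byLineCopy, because of the buffer reuse. The helper name groupByBlankLines is hypothetical:

```d
import std.string : lineSplitter;

// Hypothetical helper: collect lines into groups separated by empty lines.
string[][] groupByBlankLines(string text) {
    string[][] groups;
    string[] current;
    foreach (line; text.lineSplitter) {
        if (line.length == 0) {
            // An empty line closes the current group, if there is one.
            if (current.length)
                groups ~= current;
            current = null;
        } else {
            current ~= line;
        }
    }
    // Do not forget the last group when the text lacks a trailing empty line.
    if (current.length)
        groups ~= current;
    return groups;
}

void main() {
    auto text = "abc\n\na\nb\nc\n\nab\nac";
    assert(groupByBlankLines(text) == [["abc"], ["a", "b", "c"], ["ab", "ac"]]);
}
```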

Ali



Re: Reading files using delimiters/terminators

2020-12-26 Thread Jesse Phillips via Digitalmars-d-learn

On Sunday, 27 December 2020 at 00:13:30 UTC, Rekel wrote:
I'm trying to read a file with entries separated by '\n\n' 
(empty line), with entries containing '\n'. I thought the 
File.readLine(KeepTerminator, Terminator) might work, as it 
seems to accept strings as terminators, since there seems to 
have been a thread regarding '\r\n' separators.


I don't know if there's some underlying reason, but when I try 
to use "\n\n" as a terminator, I end up getting the entire file 
into 1 char[], so it's not delimited.


Should this work or is there a reason one cannot use byLine 
like this?


For context, I'm trying this with the puzzle input of day 6 of 
this year's advent of code. (https://adventofcode.com/)


Unfortunately std.csv is character based and not string. 
https://dlang.org/phobos/std_csv.html#.csvReader


But your use case sounds like splitter is more aligned with your 
needs.


https://dlang.org/phobos/std_algorithm_iteration.html#.splitter


Reading files using delimiters/terminators

2020-12-26 Thread Rekel via Digitalmars-d-learn
I'm trying to read a file with entries separated by '\n\n' (empty 
line), with entries containing '\n'. I thought the 
File.readLine(KeepTerminator, Terminator) might work, as it seems 
to accept strings as terminators, since there seems to have been 
a thread regarding '\r\n' separators.


I don't know if there's some underlying reason, but when I try to 
use "\n\n" as a terminator, I end up getting the entire file into 
1 char[], so it's not delimited.


Should this work or is there a reason one cannot use byLine like 
this?


For context, I'm trying this with the puzzle input of day 6 of 
this year's advent of code. (https://adventofcode.com/)