Re: [NTG-context] new hash for buffer (as file)

2022-09-26 Thread Max Chernoff via ntg-context
Hi Hans, Pablo,


> > But I do agree that the line ending handling seems a little odd. I find it
> > surprising that the buffers internally use CR line endings since no systems
> > in the past 20 years use that.
> 
> how about tex ...
> 
> \number\endlinechar
> \number\numexpr`M-`A+1\relax % plain sets up `^^M

Argh, how could I have forgotten about that. Yes, that makes complete
sense.
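
The arithmetic in the quoted TeX lines can be checked outside TeX. This is a small Python sketch, not ConTeXt code: it takes the character codes of "M" and "A" and confirms that the expression yields 13, the code of the carriage return that TeX writes as ^^M.

```python
# `M and `A in TeX are the character codes of the letters M and A.
value = ord("M") - ord("A") + 1

print(value)             # 13
print(repr(chr(value)))  # '\r' (carriage return, i.e. ^^M)
```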

> > First, run "chcp 65001" before running "context" and record the size of the
> > file written. Then, run "chcp 1251" and run "context" again. Hopefully the
> > file size doesn't change; but if it does, then that means that the binary
> > content of any file written will depend on the system's default code page,
> > which would complicate making reproducible hashes.
>
> if that were the case nothing would work .. so it's bytes in - bytes out

Ok good, that's what I was expecting. I've unfortunately used some
programs that even fairly recently depended on the system code page, so
I'm always a little cautious.

> Hi Max,
> 
> I realized later that I was doing something wrong. My fault here.

Glad that you've figured it out.

> I thought that ConTeXt would output the same character encoding as in
> the source file when saving a buffer.

Yes, Hans confirmed that that is correct. 

Thanks,
-- Max


___
If your question is of interest to others as well, please add an entry to the 
Wiki!

maillist : ntg-context@ntg.nl / https://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : https://www.pragma-ade.nl / http://context.aanhet.net
archive  : https://bitbucket.org/phg/context-mirror/commits/
wiki : https://contextgarden.net
___


Re: [NTG-context] new hash for buffer (as file)

2022-09-26 Thread Hans Hagen via ntg-context

On 9/26/2022 7:24 PM, Pablo Rodriguez via ntg-context wrote:
> On 9/26/22 02:05, Max Chernoff via ntg-context wrote:
>>
>> Hi Pablo,
>>
>>> But now I don’t understand is the following issue: if the saved file
>>> contains "\r\n", why does basic Notepad the new lines?
>>>
>>> "\r\n" are the chars to get new lines in Windows. Or what am I missing here?
>>
>> I'm not too sure what you're asking here, but Notepad was somewhat-
>> recently updated to handle both CRLF and LF line endings:
>>
>>    https://devblogs.microsoft.com/commandline/extended-eol-in-notepad/
>
> Hi Max,
>
> I realized later that I was doing something wrong. My fault here.
>
>> [...]
>> Also, you should probably check to make sure that the results of the
>> file don't depend on the current code page on Windows. Try writing out a
>> buffer from ConTeXt with the following contents:
>>
>>    АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуфхцчшщъыьэюя
>>
>> First, run "chcp 65001" before running "context" and record the size of the
>> file written. Then, run "chcp 1251" and run "context" again. Hopefully the
>> file size doesn't change; but if it does, then that means that the binary
>> content of any file written will depend on the system's default code page,
>> which would complicate making reproducible hashes.
>
> For more than two decades, all my TeX sources are written in UTF-8.
>
> I thought that ConTeXt would output the same character encoding as in
> the source file when saving a buffer.
>
> I haven’t found this issue and I’d say that all my saved buffers are
> UTF-8 encoded.

the magic is in

savedata(name,replacenewlines(content),"\n",option == v_append)

because tex reads lines in and then loses track of what it saw (cr, lf,
crlf), we use the line endings of the operating system (good old
typewriters and windows use cr+lf, old macs use cr, and linux uses lf)
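
A minimal sketch of that idea in Python rather than ConTeXt Lua: every line ending is replaced with the operating system's own before hashing, so the hash of the in-memory text matches the hash of the file that would be written. The function names replace_newlines and saved_file_hash are hypothetical and are not part of ConTeXt; the sketch also assumes that nothing else (such as a trailing newline) is added on saving.

```python
import hashlib
import os


def replace_newlines(data: bytes, newline: bytes) -> bytes:
    """Replace every CRLF, lone CR, or lone LF with the given newline."""
    # Unify to LF first so that a CRLF pair is not converted twice.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", newline)


def saved_file_hash(buffer: bytes, newline: bytes = os.linesep.encode()) -> str:
    """SHA-256 of the bytes a saved copy would contain (assumption: only
    the line endings differ between the buffer and the saved file)."""
    return hashlib.sha256(replace_newlines(buffer, newline)).hexdigest()


buffer = b"just a test\rand another one"
print(saved_file_hash(buffer, b"\n"))    # as saved on Linux
print(saved_file_hash(buffer, b"\r\n"))  # as saved on Windows
```

Passing the newline explicitly, as in the last two lines, makes the result independent of the machine the check runs on.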


Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [NTG-context] new hash for buffer (as file)

2022-09-26 Thread Pablo Rodriguez via ntg-context
On 9/26/22 02:05, Max Chernoff via ntg-context wrote:
>
> Hi Pablo,
>
>> But now I don’t understand is the following issue: if the saved file
>> contains "\r\n", why does basic Notepad the new lines?
>>
>> "\r\n" are the chars to get new lines in Windows. Or what am I missing here?
>
> I'm not too sure what you're asking here, but Notepad was somewhat-
> recently updated to handle both CRLF and LF line endings:
>
>    https://devblogs.microsoft.com/commandline/extended-eol-in-notepad/

Hi Max,

I realized later that I was doing something wrong. My fault here.

> [...]
> Also, you should probably check to make sure that the results of the
> file don't depend on the current code page on Windows. Try writing out a
> buffer from ConTeXt with the following contents:
>
>    АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуфхцчшщъыьэюя
>
> First, run "chcp 65001" before running "context" and record the size of the
> file written. Then, run "chcp 1251" and run "context" again. Hopefully the
> file size doesn't change; but if it does, then that means that the binary
> content of any file written will depend on the system's default code page,
> which would complicate making reproducible hashes.

For more than two decades, all my TeX sources are written in UTF-8.

I thought that ConTeXt would output the same character encoding as in
the source file when saving a buffer.

I haven’t found this issue and I’d say that all my saved buffers are
UTF-8 encoded.

Many thanks for your help,

Pablo


Re: [NTG-context] new hash for buffer (as file)

2022-09-26 Thread Hans Hagen via ntg-context

On 9/26/2022 2:05 AM, Max Chernoff via ntg-context wrote:
> Hi Pablo,
>
>> But now I don’t understand is the following issue: if the saved file
>> contains "\r\n", why does basic Notepad the new lines?
>>
>> "\r\n" are the chars to get new lines in Windows. Or what am I missing here?
>
> I'm not too sure what you're asking here, but Notepad was somewhat-
> recently updated to handle both CRLF and LF line endings:
>
>    https://devblogs.microsoft.com/commandline/extended-eol-in-notepad/
>
> But I do agree that the line ending handling seems a little odd. I find it
> surprising that the buffers internally use CR line endings since no systems
> in the past 20 years use that.

how about tex ...

\number\endlinechar
\number\numexpr`M-`A+1\relax % plain sets up `^^M

... you don't want to know how much hassle dealing with line endings in 
tex is

> Also, you should probably check to make sure that the results of the
> file don't depend on the current code page on Windows. Try writing out a
> buffer from ConTeXt with the following contents:
>
>    АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуфхцчшщъыьэюя
>
> First, run "chcp 65001" before running "context" and record the size of the
> file written. Then, run "chcp 1251" and run "context" again. Hopefully the
> file size doesn't change; but if it does, then that means that the binary
> content of any file written will depend on the system's default code page,
> which would complicate making reproducible hashes.

if that were the case nothing would work .. so it's bytes in - bytes out

Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [NTG-context] new hash for buffer (as file)

2022-09-25 Thread Max Chernoff via ntg-context

Hi Pablo,

> But now I don’t understand is the following issue: if the saved file
> contains "\r\n", why does basic Notepad the new lines?
> 
> "\r\n" are the chars to get new lines in Windows. Or what am I missing here?

I'm not too sure what you're asking here, but Notepad was somewhat-
recently updated to handle both CRLF and LF line endings:

   https://devblogs.microsoft.com/commandline/extended-eol-in-notepad/
   
But I do agree that the line ending handling seems a little odd. I find it
surprising that the buffers internally use CR line endings since no systems
in the past 20 years use that. 

Also, you should probably check to make sure that the results of the
file don't depend on the current code page on Windows. Try writing out a
buffer from ConTeXt with the following contents:

   АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуфхцчшщъыьэюя
   
First, run "chcp 65001" before running "context" and record the size of the
file written. Then, run "chcp 1251" and run "context" again. Hopefully the
file size doesn't change; but if it does, then that means that the binary
content of any file written will depend on the system's default code page,
which would complicate making reproducible hashes.
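
To show why the file size is a useful signal here, the following Python sketch (separate from ConTeXt, and only an illustration) encodes the same 64 Cyrillic letters in UTF-8 and in Windows-1251 and compares the byte counts. The string is rebuilt from its Unicode range U+0410 to U+044F, which is the test string above.

```python
# The test string: А..Я followed by а..я (U+0410 through U+044F).
text = "".join(chr(cp) for cp in range(0x0410, 0x0450))

utf8_bytes = text.encode("utf-8")      # two bytes per letter
cp1251_bytes = text.encode("cp1251")   # one byte per letter

print(len(text), len(utf8_bytes), len(cp1251_bytes))  # 64 128 64
```

A written file of about 128 bytes (plus any line ending) would therefore indicate UTF-8 output, while about 64 bytes would indicate that the code page had been applied.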
   
-- Max


Re: [NTG-context] new hash for buffer (as file)

2022-09-25 Thread Pablo Rodriguez via ntg-context
On 9/23/22 17:06, Pablo Rodriguez via ntg-context wrote:
> On 9/23/22 06:01, Max Chernoff via ntg-context wrote:
>> […]
>>    return utilities.sha2.hash256(
>>        str:gsub(string.char(0x0D), string.char(0x0A))
>>    )
> […]
> this works perfectly fine with Linux "str:gsub('\r','\n')", but I can’t
> make it work in Windows.

Hi again Max,

this seems to solve the issue in Windows too:

  \startbuffer[test]
  just a test
  and another one
  \stopbuffer

  \starttext
  \startluacode
  require("util-sha")

  function sha256(str)
    -- buffers hold CR between lines; map it to what the saved file uses
    if os.name == "windows" then
      -- outer parentheses keep only the string, dropping gsub's match count
      return utilities.sha2.hash256((str:gsub("\r", "\r\n")))
    else
      return utilities.sha2.hash256((str:gsub("\r", "\n")))
    end
  end
  \stopluacode

  \def\shabuffer#1%
    {\cldcontext{sha256(buffers.raw("#1"))}}

  \def\shafile#1%
    {\cldcontext{utilities.sha2.hash256(io.loaddata("#1"))}}

  \shabuffer{test}

  \savebuffer[test][temporary-αβγ, prefix=no]

  \shafile{temporary-αβγ}

  \stoptext

But now I don’t understand is the following issue: if the saved file
contains "\r\n", why does basic Notepad the new lines?

"\r\n" are the chars to get new lines in Windows. Or what am I missing here?

Many thanks for your help,

Pablo


Re: [NTG-context] new hash for buffer (as file)

2022-09-23 Thread Pablo Rodriguez via ntg-context
On 9/23/22 06:01, Max Chernoff via ntg-context wrote:
> […]
> The SHA calculation isn't working properly because of a weird newline
> issue. Try this:
> […]
>    function sha256(str)
>        return utilities.sha2.hash256(
>            str:gsub(string.char(0x0D), string.char(0x0A))
>        )
>    end
> […]

Hi Max,

this works perfectly fine with Linux "str:gsub('\r','\n')", but I can’t
make it work in Windows.

I always thought that Unix used LF (\n, if I’m not wrong) to mark a new
line, and Windows used CRLF (\r\n).

How are new lines marked in the buffer? As \r instead of \r\n or \n?

At least, Notepad (the minimal plain text editor in Windows) doesn’t
recognize newlines if I attach the buffer to the PDF document as a .txt
file.

Many thanks for your help,

Pablo


Re: [NTG-context] new hash for buffer (as file)

2022-09-22 Thread Max Chernoff via ntg-context
Hi Pablo,

> I mean, to get hash of the file attached to the document, I need to save
> the buffer for "context(utilities.sha2.hash256(io.loaddata(buffer)))".
> 
> But I don’t need to save the buffer to attach it to the PDF document.
> 
> My question is how to define \shabufferfile to avoid \savebuffer (only
> required to get the hash).

The SHA calculation isn't working properly because of a weird newline
issue. Try this:

   \setupinteraction[state=start]
   \setupinteractionscreen[option={attachment}]
   
   \startbuffer[test]
   just a test
   and another one
   \stopbuffer
   
   \starttext
   \startluacode
   require("util-sha")
   
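   function sha256(str)
       -- Buffers separate lines with CR (0x0D); the saved file uses LF (0x0A).
       -- The outer parentheses drop gsub's second return value (the count).
       return utilities.sha2.hash256(
           (str:gsub(string.char(0x0D), string.char(0x0A)))
       )
   end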
   \stopluacode
   
   \def\shabuffer#1%
   {\cldcontext{sha256(buffers.raw("#1"))}}
   
   \def\shafile#1%
   {\cldcontext{sha256(io.loaddata("#1"))}}
   
   \shabuffer{test}
   
   \savebuffer[test][temporary-αβγ, prefix=no]
   
   \shafile{temporary-αβγ}
   
   \attachment[buffer=test, name=\shabuffer{test}, method=hidden]
   \stoptext
   
You can remove the "\savebuffer" and the "\shafile"; I just kept that in to
show that the two hashes are now the same.
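
The mismatch itself can be reproduced outside ConTeXt. This Python sketch is only an illustration and makes two assumptions taken from this thread: the in-memory buffer joins its lines with CR, and the file saved on Linux joins them with LF. The name sha256_hex is a hypothetical helper, not a ConTeXt function.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


in_memory = b"just a test\rand another one"  # assumed buffer contents (CR)
on_disk = b"just a test\nand another one"    # assumed saved file (LF)

# One differing byte is enough to change the whole digest.
print(sha256_hex(in_memory) == sha256_hex(on_disk))   # False

# Replacing CR with LF before hashing makes the two digests agree.
normalized = in_memory.replace(b"\r", b"\n")
print(sha256_hex(normalized) == sha256_hex(on_disk))  # True
```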

-- Max


[NTG-context] new hash for buffer (as file)

2022-09-22 Thread Pablo Rodriguez via ntg-context
Dear list,

playing with buffer contents, I have the following file:

  \setupinteraction[state=start]
  \setupinteractionscreen[option={attachment}]

  \startbuffer[test]
  just a test
  and another one
  \stopbuffer

  \starttext
  \ctxlua{require("util-sha")}

  \def\shabuffer#1%
    {\cldcontext{utilities.sha2.hash256(buffers.raw("#1"))}}

  \def\shafile#1%
    {\cldcontext{utilities.sha2.hash256(io.loaddata("#1"))}}

  \def\shabufferfile#1%
    {\cldcontext{utilities.sha2.hash256(buffers.raw("#1"))}}

  \shabuffer{test}

  \savebuffer[test][temporary-αβγ, prefix=no]

  \shafile{temporary-αβγ}

  \attachment[buffer=test, name=\shabufferfile{test}, method=hidden]
  \stoptext

I mean, to get hash of the file attached to the document, I need to save
the buffer for "context(utilities.sha2.hash256(io.loaddata(buffer)))".

But I don’t need to save the buffer to attach it to the PDF document.

My question is how to define \shabufferfile to avoid \savebuffer (only
required to get the hash).

One possible approach is the following. If I’m not totally wrong,
"savebuffer"
(https://github.com/contextgarden/context/blob/main/tex/context/base/mkxl/buff-ini.lmt#L559)
may just be replacing new lines with "\n" in the original buffer
(https://github.com/contextgarden/context/blob/main/tex/context/base/mkxl/buff-ini.lmt#L576).

The function string.replacenewlines() is defined at
https://github.com/contextgarden/context/blob/main/tex/context/base/mkiv/util-str.lua#L1475.

If I’m not totally wrong about savebuffer replacing newlines with "\n",
I wonder how to create a temporary buffer with such a replacement, so
that it could be hashed later.

I hope my question is clear.

Many thanks in advance for your help,

Pablo