Re: [NTG-context] new hash for buffer (as file)
Hi Hans, Pablo,

> > But I do agree that the line ending handling seems a little odd. I find it
> > surprising that the buffers internally use CR line endings since no systems
> > in the past 20 years use that.
>
> how about tex ...
>
> \number\endlinechar
> \number\numexpr`M-`A+1\relax % plain sets up `^^M

Argh, how could I have forgotten about that. Yes, that makes complete sense.

> > First, run "chcp 65001" before running "context" and record the size of the
> > file written. Then, run "chcp 1251" and run "context" again. Hopefully the
> > file size doesn't change; but if it does, then that means that the binary
> > content of any file written will depend on the system's default code page,
> > which would complicate making reproducible hashes.
>
> if that were the case nothing would work .. so it's bytes in - bytes out

Ok good, that's what I was expecting. I've unfortunately used some programs
that even fairly recently depended on the system code page, so I'm always a
little cautious.

> Hi Max,
>
> I realized later that I was doing something wrong. My fault here.

Glad that you've figured it out.

> I thought that ConTeXt would output the same character encoding as in
> the source file when saving a buffer.

Yes, Hans confirmed that that is correct.

Thanks,
-- Max

___
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / https://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : https://www.pragma-ade.nl / http://context.aanhet.net
archive  : https://bitbucket.org/phg/context-mirror/commits/
wiki     : https://contextgarden.net
___
Re: [NTG-context] new hash for buffer (as file)
On 9/26/2022 7:24 PM, Pablo Rodriguez via ntg-context wrote:
> On 9/26/22 02:05, Max Chernoff via ntg-context wrote:
>> Hi Pablo,
>>
>>> But what I don’t understand now is the following issue: if the saved file
>>> contains "\r\n", why does basic Notepad not recognize the new lines?
>>>
>>> "\r\n" are the chars to get new lines in Windows. Or what am I missing here?
>>
>> I'm not too sure what you're asking here, but Notepad was somewhat
>> recently updated to handle both CRLF and LF line endings:
>>
>>     https://devblogs.microsoft.com/commandline/extended-eol-in-notepad/
>
> Hi Max,
>
> I realized later that I was doing something wrong. My fault here.
>
>> [...]
>> Also, you should probably check to make sure that the results of the
>> file don't depend on the current code page on Windows. Try writing out a
>> buffer from ConTeXt with the following contents:
>>
>>     АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуфхцчшщъыьэюя
>>
>> First, run "chcp 65001" before running "context" and record the size of the
>> file written. Then, run "chcp 1251" and run "context" again. Hopefully the
>> file size doesn't change; but if it does, then that means that the binary
>> content of any file written will depend on the system's default code page,
>> which would complicate making reproducible hashes.
>
> For more than two decades, all my TeX sources have been written in UTF-8.
>
> I thought that ConTeXt would output the same character encoding as in
> the source file when saving a buffer.
>
> I haven’t found this issue and I’d say that all my saved buffers are
> UTF-8 encoded.

the magic is in

  savedata(name,replacenewlines(content),"\n",option == v_append)

because tex reads in and then loses what it saw (cr lf crlf) we use the line
endings of the operating system (good old typewriters and windows use cr+lf
and old macs use cr while linux uses lf)

Hans

-
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-
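Editorial note: Hans names three line-ending conventions (cr, lf, crlf). To see which of them a saved file actually contains, one can count them in its raw bytes. The following is only a plain-Lua sketch; `count_line_endings` is a hypothetical helper made up for illustration and is not part of ConTeXt.

```lua
-- Hypothetical helper (not a ConTeXt function): count the three line-ending
-- conventions mentioned above in a string of raw bytes.
local function count_line_endings(data)
    -- Remove CRLF pairs first so their CR and LF are not counted twice.
    local rest, crlf = data:gsub("\r\n", "")
    local _, cr = rest:gsub("\r", "")   -- lone CR (old Mac style)
    local _, lf = rest:gsub("\n", "")   -- lone LF (Unix style)
    return { crlf = crlf, cr = cr, lf = lf }
end

local counts = count_line_endings("a\r\nb\rc\nd\r\n")
print(counts.crlf, counts.cr, counts.lf)  -- 2  1  1
```

Inside ConTeXt the bytes could come from `io.loaddata`, which the thread already uses; in plain Lua, a file opened with `io.open(name, "rb")` gives untranslated bytes.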
Re: [NTG-context] new hash for buffer (as file)
On 9/26/22 02:05, Max Chernoff via ntg-context wrote:
> Hi Pablo,
>
>> But what I don’t understand now is the following issue: if the saved file
>> contains "\r\n", why does basic Notepad not recognize the new lines?
>>
>> "\r\n" are the chars to get new lines in Windows. Or what am I missing here?
>
> I'm not too sure what you're asking here, but Notepad was somewhat
> recently updated to handle both CRLF and LF line endings:
>
>     https://devblogs.microsoft.com/commandline/extended-eol-in-notepad/

Hi Max,

I realized later that I was doing something wrong. My fault here.

> [...]
> Also, you should probably check to make sure that the results of the
> file don't depend on the current code page on Windows. Try writing out a
> buffer from ConTeXt with the following contents:
>
>     АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуфхцчшщъыьэюя
>
> First, run "chcp 65001" before running "context" and record the size of the
> file written. Then, run "chcp 1251" and run "context" again. Hopefully the
> file size doesn't change; but if it does, then that means that the binary
> content of any file written will depend on the system's default code page,
> which would complicate making reproducible hashes.

For more than two decades, all my TeX sources have been written in UTF-8.

I thought that ConTeXt would output the same character encoding as in
the source file when saving a buffer.

I haven’t found this issue and I’d say that all my saved buffers are
UTF-8 encoded.

Many thanks for your help,

Pablo
Re: [NTG-context] new hash for buffer (as file)
On 9/26/2022 2:05 AM, Max Chernoff via ntg-context wrote:
> Hi Pablo,
>
>> But what I don’t understand now is the following issue: if the saved file
>> contains "\r\n", why does basic Notepad not recognize the new lines?
>>
>> "\r\n" are the chars to get new lines in Windows. Or what am I missing here?
>
> I'm not too sure what you're asking here, but Notepad was somewhat
> recently updated to handle both CRLF and LF line endings:
>
>     https://devblogs.microsoft.com/commandline/extended-eol-in-notepad/
>
> But I do agree that the line ending handling seems a little odd. I find it
> surprising that the buffers internally use CR line endings since no systems
> in the past 20 years use that.

how about tex ...

  \number\endlinechar
  \number\numexpr`M-`A+1\relax % plain sets up `^^M

... you don't want to know how much hassle dealing with line endings in tex is

> Also, you should probably check to make sure that the results of the
> file don't depend on the current code page on Windows. Try writing out a
> buffer from ConTeXt with the following contents:
>
>     АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуфхцчшщъыьэюя
>
> First, run "chcp 65001" before running "context" and record the size of the
> file written. Then, run "chcp 1251" and run "context" again. Hopefully the
> file size doesn't change; but if it does, then that means that the binary
> content of any file written will depend on the system's default code page,
> which would complicate making reproducible hashes.

if that were the case nothing would work .. so it's bytes in - bytes out

Hans

-
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-
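Editorial note: the arithmetic in Hans's `\numexpr` line can be checked outside TeX. "M" is the 13th letter, so `` `M-`A+1 `` is 13, and `^^M` is TeX's notation for character 13, the carriage return. The plain-Lua lines below only redo that arithmetic; they say nothing about how ConTeXt stores buffers internally.

```lua
-- Same arithmetic as \numexpr`M-`A+1\relax, done with byte values.
local code = string.byte("M") - string.byte("A") + 1

print(code)                        -- 13
print(code == string.byte("\r"))   -- true: character 13 is CR
print(code == string.byte("\n"))   -- false: LF is character 10
```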
Re: [NTG-context] new hash for buffer (as file)
Hi Pablo,

> But what I don’t understand now is the following issue: if the saved file
> contains "\r\n", why does basic Notepad not recognize the new lines?
>
> "\r\n" are the chars to get new lines in Windows. Or what am I missing here?

I'm not too sure what you're asking here, but Notepad was somewhat
recently updated to handle both CRLF and LF line endings:

    https://devblogs.microsoft.com/commandline/extended-eol-in-notepad/

But I do agree that the line ending handling seems a little odd. I find it
surprising that the buffers internally use CR line endings since no systems
in the past 20 years use that.

Also, you should probably check to make sure that the results of the
file don't depend on the current code page on Windows. Try writing out a
buffer from ConTeXt with the following contents:

    АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуфхцчшщъыьэюя

First, run "chcp 65001" before running "context" and record the size of the
file written. Then, run "chcp 1251" and run "context" again. Hopefully the
file size doesn't change; but if it does, then that means that the binary
content of any file written will depend on the system's default code page,
which would complicate making reproducible hashes.

-- Max
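Editorial note: file size works as a test here because the 64 Cyrillic letters above take two bytes each in UTF-8 but one byte each in Windows-1251, so a writer that depended on the code page would produce files of clearly different length. The plain-Lua sketch below builds the same letters from their code points (U+0410 to U+044F) and counts the UTF-8 bytes. The manual two-byte encoding is an illustration that is valid only for this code-point range.

```lua
-- Build the 64 test letters from their code points, encoding each one as
-- two-byte UTF-8 by hand (valid for U+0080..U+07FF, which covers this range).
local letters = {}
for cp = 0x0410, 0x044F do
    local lead  = 0xC0 + math.floor(cp / 64)  -- 110xxxxx lead byte
    local trail = 0x80 + cp % 64              -- 10xxxxxx continuation byte
    letters[#letters + 1] = string.char(lead, trail)
end
local text = table.concat(letters)

print(#letters)  -- 64 letters
print(#text)     -- 128 bytes when encoded as UTF-8
```

In Windows-1251 the same 64 letters would occupy 64 bytes, which is the difference the two `chcp` runs are meant to reveal.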
Re: [NTG-context] new hash for buffer (as file)
On 9/23/22 17:06, Pablo Rodriguez via ntg-context wrote:
> On 9/23/22 06:01, Max Chernoff via ntg-context wrote:
>> […]
>>         return utilities.sha2.hash256(
>>             str:gsub(string.char(0x0D), string.char(0x0A))
>>         )
> […]
> this works perfectly fine with Linux "str:gsub('\r','\n')", but I can’t
> make it work in Windows.

Hi again Max,

this seems to solve the issue in Windows too:

  \startbuffer[test]
  just a test
  and another one
  \stopbuffer

  \starttext
  \startluacode
  require("util-sha")
  function sha256(str)
      if os.name == "windows" then
          return utilities.sha2.hash256(str:gsub("\r", "\r\n"))
      else
          return utilities.sha2.hash256(str:gsub("\r", "\n"))
      end
  end
  \stopluacode

  \def\shabuffer#1%
    {\cldcontext{sha256(buffers.raw("#1"))}}

  \def\shafile#1%
    {\cldcontext{utilities.sha2.hash256(io.loaddata("#1"))}}

  \shabuffer{test}

  \savebuffer[test][temporary-αβγ, prefix=no]

  \shafile{temporary-αβγ}
  \stoptext

But what I don’t understand now is the following issue: if the saved file
contains "\r\n", why does basic Notepad not recognize the new lines?

"\r\n" are the chars to get new lines in Windows. Or what am I missing here?

Many thanks for your help,

Pablo
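Editorial note: the platform branch in Pablo's function can be separated from the hashing and tried in plain Lua. In this sketch `to_file_newlines` and its `platform` argument are hypothetical names. Inside ConTeXt, the value of `os.name` used in the message above would be passed in.

```lua
-- Hypothetical helper: turn the CR line endings of the raw buffer into the
-- line ending the saved file is expected to use on the given platform.
local function to_file_newlines(str, platform)
    local eol = (platform == "windows") and "\r\n" or "\n"
    -- Parentheses keep only the new string and drop gsub's match count.
    return (str:gsub("\r", eol))
end

local raw = "just a test\rand another one"
print(#raw)                                -- 27 bytes
print(#to_file_newlines(raw, "linux"))     -- 27 bytes: CR became LF
print(#to_file_newlines(raw, "windows"))   -- 28 bytes: CR became CR+LF
```

The one-byte difference per line break is why a hash of the buffer only matches the hash of the saved file once both use the same line endings.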
Re: [NTG-context] new hash for buffer (as file)
On 9/23/22 06:01, Max Chernoff via ntg-context wrote:
> […]
> The SHA calculation isn't working properly because of a weird newline
> issue. Try this:
> […]
>     function sha256(str)
>         return utilities.sha2.hash256(
>             str:gsub(string.char(0x0D), string.char(0x0A))
>         )
>     end
> […]

Hi Max,

this works perfectly fine with Linux "str:gsub('\r','\n')", but I can’t
make it work in Windows.

I always thought that Unix used LF (\n, if I’m not wrong) to mark a new
line, and Windows used CRLF (\r\n).

How are new lines marked in the buffer? As \r instead of \r\n or \n?

At least, Notepad (the minimal plain text editor in Windows) doesn’t
recognize newlines if I attach the buffer to the PDF document as a .txt
file.

Many thanks for your help,

Pablo
Re: [NTG-context] new hash for buffer (as file)
Hi Pablo,

> I mean, to get hash of the file attached to the document, I need to save
> the buffer for "context(utilities.sha2.hash256(io.loaddata(buffer)))".
>
> But I don’t need to save the buffer to attach it to the PDF document.
>
> My question is how to define \shabufferfile to avoid \savebuffer (only
> required to get the hash).

The SHA calculation isn't working properly because of a weird newline
issue. Try this:

    \setupinteraction[state=start]
    \setupinteractionscreen[option={attachment}]

    \startbuffer[test]
    just a test
    and another one
    \stopbuffer

    \starttext
    \startluacode
    require("util-sha")
    function sha256(str)
        return utilities.sha2.hash256(
            str:gsub(string.char(0x0D), string.char(0x0A))
        )
    end
    \stopluacode

    \def\shabuffer#1%
      {\cldcontext{sha256(buffers.raw("#1"))}}

    \def\shafile#1%
      {\cldcontext{sha256(io.loaddata("#1"))}}

    \shabuffer{test}

    \savebuffer[test][temporary-αβγ, prefix=no]

    \shafile{temporary-αβγ}

    \attachment[buffer=test, name=\shabuffer{test}, method=hidden]
    \stoptext

You can remove the "\savebuffer" and the "\shafile"; I just kept that in to
show that the two hashes are now the same.

-- Max
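Editorial note: the normalisation step in Max's function can be tried on its own in plain Lua. One detail to be aware of is that `string.gsub` returns two values, the new string and the number of replacements, so passing the call straight to another function also passes the count. Whether that second argument matters to `utilities.sha2.hash256` is not checked here. The helper `cr_to_lf` below is a hypothetical name, and its parentheses drop the count.

```lua
-- Hypothetical helper (not part of ConTeXt): map every CR (0x0D) to LF (0x0A).
local function cr_to_lf(str)
    -- The outer parentheses keep only gsub's first result (the new string)
    -- and discard the second one (the number of replacements made).
    return (str:gsub(string.char(0x0D), string.char(0x0A)))
end

local raw = "just a test\rand another one"
local normalized = cr_to_lf(raw)

print(normalized == "just a test\nand another one")  -- true
print(select("#", raw:gsub("\r", "\n")))             -- 2: string and count
print(select("#", cr_to_lf(raw)))                    -- 1: string only
```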