If by "non-trivial" you mean "not so simple", then read on:

As others did with youtube-dl, I decided to take a look at how ViewTube
does things. I chose ViewTube because I'm somewhat more used to
JavaScript (the language in which ViewTube is written) than to Python
(the language in which youtube-dl is written).

For ViewTube, if the video "reference" (not the URL) has "&s=", then the
text of the possibly-non-free code in the "base.js" file is read by
ytDecryptFunction() and turned into the ytDecryptSignature() function.
However, the file itself is treated as text, not executed by ViewTube.

The result of ytDecryptFunction() (that is, ytDecryptSignature()) is
very short. It is still JavaScript code, although it apparently consists
only of a series of variable declarations with some math, and no
conditionals, no loops, and no breaks.

To understand how the ytDecryptSignature() function comes into
existence, I decided to test with a video that is known to trigger the
"signature decryption". The video is:

<https://www.youtube.com/watch?v=ghQvZ9IID2A>

I downloaded its "base.js" which, at least in my case, is located at:

<https://www.youtube.com/yts/jsbin/player-vfleGnGfg/pt_BR/base.js>

In order to make ytDecryptSignature(), ytDecryptFunction() does the
following:

1. Declares the variables it will use. Nothing unusual so far.

2. Takes the "base.js" text and removes all line breaks. Nothing unusual
   here either.

3. In the new "base.js", looks for the name of the function that is used
   to decrypt things. For this it makes use of the
   /"signature"\s*,\s*([^\)]*?)\(/ regular expression. I confess that
   this one puzzled me for some time, especially the "([^\)]*?)\(" part.
   To explain further:

   - The '"signature"\s*,\s*' part will match '"signature"' followed by
     zero or more space-like characters, followed by one comma, followed
     by zero or more space-like characters. So far this is OK.
   - The "([^\)]*?)\(" part will capture a sequence of characters that
     contains no closing parenthesis (the "([^\)]*?)" part) as long as
     that sequence comes immediately before an opening parenthesis (the
     "\(" part). The "*?" in the capture and the "\(" after it make sure
     that the capture is always the shortest possible match, so it would
     match "RE" in "RE(c" and "exampleA" in
     "exampleA(argumentA) = { if (". I do find it risky that they are
     using a rule such as "every character except a closing
     parenthesis", but thankfully almost every JavaScript
     procedure/command/function requires a pair of parentheses.

4. Now that it has the function name, it builds a regular expression to
   match the function itself:

   - If the function name has a "$", escape it.

   - Append "\\s*=\\s*function\\s*" to the regular expression. The
     "\\s*" will match any number of space-like characters. The rest is
     taken literally.

   - Append "\\s*\\(\\w+\\)\\s*\\{(.*?)\\}" to the regular expression.
     "\\(" and "\\)" will match parentheses literally. Same for "\\{"
     and "\\}". The "\\w+" matches a sequence of one or more
     alphanumeric or underscore characters. "(.*?)" is a capture that
     will match the shortest sequence of any characters as long as it's
     between "{" and "}". I'd like to point out that "\\{(.*?)\\}" could
     match almost anything, including conditionals, function calls, and
     pairs of parentheses. However, it will stop in the middle of the
     function body at the first "}" it finds.

5. With this regular expression finally ready, ViewTube simply uses it
   to find the function in the modified "base.js".

6. ViewTube then goes through the modified "base.js" file to find each
   part of the main function and copies the custom variables and
   functions it depends on, so as to make the result "self-contained".

7. Then it makes sure that the main definition is prefixed with "var "
   to declare it as a variable, and is placed between "try {" and
   "} catch(e) { return null }".

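Steps 2 to 5 above can be sketched roughly as follows. This is only my
reading of the description, not ViewTube's actual code: the
"sampleBaseJs" text, the "xy$" function name, the findDecryptBody()
helper, and the exact line-break pattern are all made up for
illustration. Only the two regular expressions are the ones quoted
above.

```javascript
// Hypothetical stand-in for the text of "base.js"; the real file is far
// larger, and "xy$" is an invented function name.
const sampleBaseJs = [
  'var yE={CT:function(a,b){a.splice(0,b)}};',
  'xy$=function(a){a=a.split("");yE.CT(a,3);return a.join("")};',
  'c.set("signature",xy$(d));'
].join("\n");

// Hypothetical helper that follows steps 2 to 5 of the description.
function findDecryptBody(text) {
  // Step 2: remove all line breaks (the exact pattern is an assumption).
  const flat = text.replace(/[\r\n]+/g, "");

  // Step 3: capture the function name that follows '"signature",'.
  const nameMatch = flat.match(/"signature"\s*,\s*([^\)]*?)\(/);
  if (!nameMatch) return null;

  // Step 4: escape any "$" in the name, then append the two pieces
  // quoted in the description to build the function-matching regex.
  const escapedName = nameMatch[1].replace(/\$/g, "\\$&");
  const bodyPattern = new RegExp(
    escapedName + "\\s*=\\s*function\\s*" + "\\s*\\(\\w+\\)\\s*\\{(.*?)\\}"
  );

  // Step 5: use the regex on the flattened text. The lazy "(.*?)" stops
  // at the first "}" it meets.
  const bodyMatch = flat.match(bodyPattern);
  return bodyMatch ? { name: nameMatch[1], body: bodyMatch[1] } : null;
}

const found = findDecryptBody(sampleBaseJs);
console.log(found.name); // xy$
console.log(found.body); // a=a.split("");yE.CT(a,3);return a.join("")
```

Note that the sample body has no nested "{ }" pairs. If it had any, the
lazy capture would cut the body short at the first "}", which is the
risk mentioned in step 4.
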
After this, ViewTube makes a new function out of the result and makes
sure that the first argument passed to it ends up in the function-local
"a" variable. In ViewTube, the value of "a" is the series of characters
after "&s=" that stops before an ampersand (&) or the end of the string
($); this is controlled by the "&s=(.*?)(&|$)" match part.

The result is similar to this (with functions inside):

try {var yE={CT:function(a,b){a.splice(0,b)},Jj:function(a,b){var c=a[0];a[0]=a[b%a.length];a[b]=c},c1:function(a){a.reverse()}};a=a.split("");yE.CT(a,3);yE.Jj(a,51);yE.Jj(a,36);yE.CT(a,3);return a.join("")} catch(e) {return null}

Which can be beautified as:

--8<---------------cut here---------------start------------->8---
try {
    var yE = {
        CT : function(a, b) {
            a.splice(0, b)
        },
        Jj : function(a, b) {
            var c = a[0];
            a[0] = a[b % a.length];
            a[b] = c
        },
        c1 : function(a) {
            a.reverse()
        }
    };
    a = a.split("");
    yE.CT(a, 3);
    yE.Jj(a, 51);
    yE.Jj(a, 36);
    yE.CT(a, 3);
    return a.join("")
} catch(e) {
    return null
}
--8<---------------cut here---------------end--------------->8---

However, none of these variations make use of conditionals ("if",
"case") or loops; there are only the calls between the functions
themselves.

Let's take the following video signature as an example:

A84A842840706E172FA34F8E6096BBE8300549E2AA8AA4.31ADA18F942258766E20BFBEE4582464226B277C

This would be the "s" value in the video "reference" and the value of
the "a" variable in ytDecryptSignature().

1. 'a = a.split("");' turns each character into an element of a new
   array. That is: a[0] = "A", a[1] = "8" and so on.

2. "a.splice(0, 3)" removes the items at indices 0 to 2, that is, the
   first 3 items. Now "a" is:

   A842840706E172FA34F8E6096BBE8300549E2AA8AA4.31ADA18F942258766E20BFBEE4582464226B277C

3. "var c = a[0];" takes the value of the first array element, "A", and
   saves it in the "c" variable.
"a[0] = a[51 % a.length];" takes the length of "a" (length: 84), takes the reminder of 51/84 (reminder: 51) and takes the value of the 51 element/key from "a" to put it as the value of the 0 element/key of "a". "a[51] = c" (51 here *isn't* the reminder) sets the value of 51 element/key of "a" to the previously known value of a[0]. Now the value of "a" is: F842840706E172FA34F8E6096BBE8300549E2AA8AA4.31ADA18A942258766E20BFBEE4582464226B277C 4. Do (3) again, but using the number 36 as array element key reference and as dividend for the remainder calculations. Now the value of "a" is: 2842840706E172FA34F8E6096BBE8300549EFAA8AA4.31ADA18A942258766E20BFBEE4582464226B277C 5. "a.splice(0, 3)" removes the first three elements of the "a" array again. Now the value of "a" is: 2840706E172FA34F8E6096BBE8300549EFAA8AA4.31ADA18A942258766E20BFBEE4582464226B277C 6. 'return a.join("")' simply joins the "a" array into a string again, and returns it as the result of the function. Now, as to "how often" it is non-trivial, that depends on how YouTube decides to make the video available. Others in this thread and also in the one in the Trisquel forum for the English speakers have pointed out that this signature is common in videos which contain content under the standard copyright license, any music video, or any "VEVO" video. Searching for "youtube copyright statistics" results in no official document so far. So, since there's no official documentation of the copyright statistics of videos in YouTube, and we are dealing with non-free software, I *think* this is mostly (75%) non-trivial. Of course, I might be wrong. :) Ineiev <[email protected]> writes: > Hello, > > > How often is that code non-trivial? > -- - https://libreplanet.org/wiki/User:Adfeno - Palestrante e consultor sobre /software/ livre (não confundir com gratis). - "WhatsApp"? Ele não é livre. Por favor, use o GNU Ring ou o Tox. 
- Contact: https://libreplanet.org/wiki/User:Adfeno#vCard
- Common file formats accepted (only without DRM): Corel Draw,
  Microsoft Office, MP3, MP4, WMA, WMV.
- Common file formats accepted and sent: CSV, GNU Dia, GNU Emacs Org,
  GNU GIMP, Inkscape SVG, JPG, LibreOffice (ODF standard), OGG, OPUS,
  PDF (only without DRM), PNG, TXT, WEBM.
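
P.S. The step-by-step walkthrough of the example signature can be
checked by running the generated code directly. In the sketch below, the
body is the generated code quoted in this message; wrapping it in a
function named ytDecryptSignature() that takes "a" as its argument, and
the "s" and "decrypted" variable names, are my own assumptions for
illustration and not necessarily how ViewTube builds the function.

```javascript
// Generated code from this message, wrapped in a named function so it
// can be called directly (the wrapper itself is an assumption).
function ytDecryptSignature(a) {
  try {
    var yE = {
      // Remove the first b elements.
      CT: function (a, b) { a.splice(0, b); },
      // Swap element 0 with element b (only the read uses b % length).
      Jj: function (a, b) { var c = a[0]; a[0] = a[b % a.length]; a[b] = c; },
      // Reverse in place (present in the helper object but never called).
      c1: function (a) { a.reverse(); }
    };
    a = a.split("");
    yE.CT(a, 3);
    yE.Jj(a, 51);
    yE.Jj(a, 36);
    yE.CT(a, 3);
    return a.join("");
  } catch (e) {
    return null;
  }
}

// The example signature (the "s" value) used in the walkthrough.
const s =
  "A84A842840706E172FA34F8E6096BBE8300549E2AA8AA4." +
  "31ADA18F942258766E20BFBEE4582464226B277C";

const decrypted = ytDecryptSignature(s);
console.log(decrypted);
// 2840706E172FA34F8E6096BBE8300549EFAA8AA4.31ADA18A942258766E20BFBEE4582464226B277C
```

The printed value is the same string reached at step 5 of the
walkthrough, and it is 6 characters shorter than the input because of
the two "splice(0, 3)" calls.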
