Re: [Trisquel-users] Need help with a Perl script

2016-06-08 Thread gnuser
It's working now! I even fixed a bug at the end of the script: it never  
printed "Finished name_of_video", and now it does. (I would send the fix back  
to the people who wrote the script, but they never answered my emails, so I  
don't think they even read them.)
Also, I updated the curl parameters to look more like Firefox, or in this  
case the latest TBB. Hope you guys enjoy it (if anyone needs it lol).


Also, I ran a few small tests with youtube-dl's SOCKS proxy support and it  
looks good. I will use it if necessary, but I am unsure what information  
youtube-dl might leak about the system, version, date/time, etc. So far  
socks5 looks good.


Here is the script:

#!/usr/bin/perl -T

use strict;
use warnings;

#
##  Calomel.org  ,:,  Download Youtube videos
##Script Name : youtube_download.pl
##Version : 0.58
##Valid from  : March 2016
##URL Page: https://calomel.org/youtube_wget.html
##OS Support  : Linux, Mac OSX, OpenBSD, FreeBSD
#`:`
## Two arguments
##$1 Youtube URL from the browser
##$2 prefix to the file name of the video (optional)
#

## options ##

# Option: what file type do you want to download? The string is used to search
# in the youtube URL so you can choose mp4, webm, avi or flv.  mp4 is the most
# compatible and plays on android, ipod, ipad, iphones, vlc and mplayer.
my $fileType = "mp4";

# Option: what visual resolution or quality do you want to download? List
# multiple values just in case the highest quality video is not available, the
# script will look for the next resolution. You can choose "itag=22" for 720p,
# "itag=18" which means standard definition 640x360 and "itag=17" which is
# mobile resolution 144p (176x144). The script will always prefer to download
# the first listed resolution video format from the list if available.
my $resolution = "itag=22,itag=18";

# Option: How many times should the script retry if the download fails?
my $retryTimes = 2;

# Option: turn on DEBUG mode. Use this to reverse engineer this code if you are
# making changes or you are building your own youtube download script.
my $DEBUG=0;

#

# initialize global variables and sanitize the path
$ENV{PATH} = "/bin:/usr/bin:/usr/local/bin:/opt/local/bin";
my $prefix = "";
my $retry = 1;
my $retryCounter = 0;
my $user_url = "";
my $user_prefix = "";

# collect the URL from the command line argument
chomp($user_url = $ARGV[0]);
my $url = "$1" if ($user_url =~ m/^([a-zA-Z0-9\_\-\&\?\=\:\.\/]+)$/ or die "\nError: Illegal characters in YouTube URL\n\n" );


# declare the user defined file name prefix if specified
if (defined($ARGV[1])) {
   chomp($user_prefix = $ARGV[1]);
   $prefix = "$1" if ($user_prefix =~ m/^([a-zA-Z0-9\_\-\.\ ]+)$/ or die "\nError: Illegal characters in filename prefix\n\n" );
}

# if the url down below does not parse correctly we start over here
tryagain:

# make sure we are not in a tryagain loop by checking the counter
if ( $retryTimes < $retryCounter ) {
   print "\n\n Stopping the loop because the retryCounter has exceeded the retryTimes option.";
   print "\n The video may not be available at the requested resolution or may be copy protected.\n\n";
   print "\nretryTimes counter = $retryTimes\n\n" if ($DEBUG == 1);
   exit;
}

# download the html from the youtube page containing the page title and video
# url. The page title will be used for the local video file name and the url
# will be sanitized to download the video.
my $html = `curl --socks5-hostname 127.0.0.1:9150 -A "Mozilla/5.0 (Windows NT 6.1; rv:45.0) Gecko/20100101 Firefox/45.0" -H "Accept-Language: en-us,en;q=0.5" -sS -L --compressed "$url"` or die "\nThere was a problem downloading the HTML page.\n\n";


# format the title of the page to use as the file name
my ($title) = $html =~ m/<title>(.+)<\/title>/si;
$title =~ s/[^\w\d]+/_/g or die "\nError: we could not find the title of the HTML page. Check the URL.\n\n";
$title = lc ($title);
$title =~ s/_youtube//ig;
$title =~ s/^_//ig;
$title =~ s/_amp//ig;
$title =~ s/_39_s/s/ig;
$title =~ s/_quot//ig;

# filter the URL of the video from the HTML page
my ($download) = $html =~ /"url_encoded_fmt_stream_map"(.*)/ig;

# Print the raw separated strings in the HTML page
#print "\n$download\n\n" if ($DEBUG == 1);

# This is where we loop through the HTML code and select the file type and
# video quality.
my @urls = split(',', $download);
OUTERLOOP:
foreach my $val (@urls) {
#   print "\n$val\n\n";

   if ( $val =~ /$fileType/ ) {
      my @res = split(',', $resolution);
      foreach my $ress (@res) {
         if ( $val =~ /$ress/ ) {
            print "\n  html to url separation complete.\n\n" if ($DEBUG == 1);
            print "$val\n" if ($DEBUG == 1);
            $download = $val;
            last OUTERLOOP;
         }
      }
   }
}

# clean up by translating url encoding and removing unwanted strings

Re: [Trisquel-users] Need help with a Perl script

2016-06-08 Thread gnuser

Thanks! That worked! I love to learn new stuff :)
However, there seems to be a problem with the script itself (which may have  
been a mistake on my part), so I will have to check it more carefully and let  
you guys know if it worked or not :)


Re: [Trisquel-users] Need help with a Perl script

2016-06-08 Thread gnuser
Because it's still experimental. I am not even sure whether hostnames are  
resolved over the proxy or directly. curl, on the other hand, is software  
that I have tried and used many times, and I have relative confidence in  
its SOCKS implementation.
Also, I am not even sure if the youtube-dl version in the current repo has it  
or not (though I suspect it does, given that youtube-dl is updated by  
automatic updates almost every other day on Trisquel).
Thanks anyway, if I was into youtube-dl it would have been nice to know this  
:)


Re: [Trisquel-users] Need help with a Perl script

2016-06-08 Thread firefoxbugreporter

This gives you a clue
curl: option --socks5-hostname 127.0.0.1:9150: is unknown

Curl thinks that the whole thing is one parameter. You can get the same  
result by running

curl '--socks5-hostname 127.0.0.1:9150' localhost

What you should do is make it separate. You should try removing the quotes  
(") altogether, or putting both parts in separate quotes like this:

"--socks5-hostname" "127.0.0.1:9150"
or this:
"--socks5-hostname", "127.0.0.1:9150"
In the latter case the comma is part of the Perl syntax: it separates the  
elements of the list passed to system().
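
A minimal sketch of the difference, assuming a hypothetical helper count_args
(not part of the script) that runs a tiny child Perl process in place of curl
and reports how many arguments it received. When a command is given to Perl
as a list, each list element reaches the program as exactly one argument and
no shell splits it on spaces:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the script): run a child perl in list
# form, bypassing the shell, and return how many arguments it received.
sub count_args {
    my @args = @_;
    open(my $fh, '-|', $^X, '-e', 'print scalar(@ARGV)', '--', @args)
        or die "cannot start child: $!";
    my $count = <$fh>;
    close($fh);
    return $count;
}

# One list element is one argument, even if it contains a space.
my $joined = count_args("--socks5-hostname 127.0.0.1:9150");

# Two list elements are two arguments: the option and its value.
my $split = count_args("--socks5-hostname", "127.0.0.1:9150");

print "joined: $joined argument(s)\n";
print "split:  $split argument(s)\n";
```

The joined form arrives as 1 argument and the split form as 2. The same
separation would apply to the -A and -H pairs in the system() call, and any
single quotes left inside an element are passed to curl literally, because no
shell is there to strip them.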


Re: [Trisquel-users] Need help with a Perl script

2016-06-07 Thread danigaritarojas

Wait just a sec.
Why are you using this script instead of using youtube-dl?
youtube-dl supports proxy too.

--proxy URL  Use the specified HTTP/HTTPS/SOCKS proxy.
             To enable experimental SOCKS proxy, specify
             a proper scheme. For example
             socks5://127.0.0.1:1080/. Pass in an empty
             string (--proxy "") for direct connection
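
As a rough sketch of how that option could be driven from Perl, assuming Tor
Browser's SOCKS listener at 127.0.0.1:9150 (the address used with curl in
this thread) and a hypothetical helper youtube_dl_command; the video URL is a
placeholder. It only builds and prints the argument list, and the actual
system() call is left commented out:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build the youtube-dl argument list as separate
# elements so that no shell quoting is involved.
sub youtube_dl_command {
    my ($proxy, $video_url) = @_;
    return ("youtube-dl", "--proxy", $proxy, $video_url);
}

# Assumption: Tor Browser is listening for SOCKS5 on 127.0.0.1:9150.
my @cmd = youtube_dl_command("socks5://127.0.0.1:9150/",
                             "https://www.youtube.com/watch?v=VIDEO_ID");

print join(" ", @cmd), "\n";

# Uncomment to run it; list-form system passes each element as one argument.
# system(@cmd) == 0 or die "youtube-dl failed: $?\n";
```

The help text itself calls SOCKS support experimental, so this is only a
starting point for testing, not a statement about what youtube-dl sends or
resolves through the proxy.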



Re: [Trisquel-users] Need help with a Perl script

2016-06-07 Thread gnuser

Wish it was that simple.
I had already tried that and it gave the same error.
Also, notice that the first curl occurrence is okay (just copy paste the  
script and give it a go, check what happens). The second one is where the  
script fails. It actually manages to do the "discover video title and url"  
thing very well, it just fails to download.


Re: [Trisquel-users] Need help with a Perl script

2016-06-07 Thread danigaritarojas

>"curl: option --socks5-hostname 127.0.0.1:9150: is unknown"
>"I think there is something wrong in the line:
system("curl", "-sSRL", "--socks5-hostname 127.0.0.1:9150", "-A 'Mozilla/5.0 (Windows NT 6.1; rv:31.0) Gecko/20100101 Firefox/31.0'", "-H 'Accept-Language: en-us,en;q=0.5'", "-o", "$filename", "--retry", "5", "-C", "-", "$download");

but I don't know what."

Why do you think that? Actually I too think there's something wrong with that  
line.

You added that "--socks5-hostname 127.0.0.1:9150", right?
Because I think it should be:
"--socks5-hostname '127.0.0.1:9150'"


Re: [Trisquel-users] Need help with a Perl script

2016-06-07 Thread gnuser

curl: option --socks5-hostname 127.0.0.1:9150: is unknown
curl: try 'curl --help' or 'curl --manual' for more information

Which is weird, considering that the first time curl runs it uses the Tor  
proxy just fine (I checked, and it is using the proxy, not bypassing it).

I think there is something wrong in the line:
system("curl", "-sSRL", "--socks5-hostname 127.0.0.1:9150", "-A 'Mozilla/5.0 (Windows NT 6.1; rv:31.0) Gecko/20100101 Firefox/31.0'", "-H 'Accept-Language: en-us,en;q=0.5'", "-o", "$filename", "--retry", "5", "-C", "-", "$download");

but I don't know what.


Re: [Trisquel-users] Need help with a Perl script

2016-06-07 Thread danigaritarojas

http://stackoverflow.com/

Also:
>"gives me an error"
What error?