> Here is a library that saved my day
> in the past when parsing phone numbers in Spark:
>
> https://github.com/google/libphonenumber
>
> If you combine it with Bjørn's suggestions you will have a good start on your
> linkage task.
>
> Best regards,
> Anastasios Zouzias
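libphonenumber itself is a Java library (a Python port exists as `phonenumbers`). As a rough stdlib stand-in for what it does properly, a digits-only canonicalization before matching might look like the sketch below; the `normalize` helper and the French default country code are assumptions for illustration:

```python
import re

def normalize(raw: str, default_country: str = "33") -> str:
    """Strip punctuation and map a number to a digits-only canonical form.
    A rough stand-in for libphonenumber's parse/format; assumes France ("33")
    as the default country."""
    digits = re.sub(r"\D", "", raw)
    if digits.startswith("00"):       # international 00 prefix -> drop it
        digits = digits[2:]
    elif digits.startswith("0"):      # national prefix -> prepend country code
        digits = default_country + digits[1:]
    return digits

# Two spellings of the same French number normalize identically:
print(normalize("+33 1 23 45 67 89"))   # 33123456789
print(normalize("01.23.45.67.89"))      # 33123456789
```

Real phone-number parsing has many more edge cases (extensions, short codes, per-country prefix rules), which is exactly what the library above handles.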
> On Sat, 1 Apr 2023 at 19:32, Philippe de Rochambeau <phi...@free.fr> wrote:
Hello,
I’m looking for an efficient way in Spark to search for a series of telephone
numbers, contained in a CSV file, in a data set column.
In pseudo code,
for tel in [tel1, tel2, …, tel40000]
    search for tel in dataset using .like("%tel%")
end for
I’m using the like function
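Chaining 40,000 `.like("%tel%")` filters forces the dataset to be scanned once per number. If both sides are canonicalized first, the whole job becomes a single pass with set membership (or, in Spark, one join). A minimal plain-Python sketch of the idea, with made-up row IDs and numbers:

```python
import re

def canon(tel: str) -> str:
    """Digits-only canonical form so punctuation and spacing differences vanish."""
    return re.sub(r"\D", "", tel)

# Canonicalize the CSV's 40,000 numbers once into a set (O(1) membership tests).
wanted = {canon(t) for t in ["01 23 45 67 89", "05.55.66.67"]}

# One pass over the dataset column instead of 40,000 LIKE scans per row.
rows = [("r1", "0123456789"), ("r2", "0987654321"), ("r3", "05556667")]
hits = [rid for rid, tel in rows if canon(tel) in wanted]
print(hits)   # ['r1', 'r3']
```

In Spark the same shape would be: register `canon` as a UDF, add a canonical column to both the dataset and the 40,000-number CSV, then do an inner join with `broadcast()` on the small side, so the large table is scanned exactly once.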
> …further analysis such as random forests or
> Markov chains, then GraphX alone will not help you much.
>
>> On 10. Feb 2018, at 15:49, Philippe de Rochambeau <phi...@free.fr> wrote:
Hello,
Let’s say a website log is structured as follows:
timestamp;button;page;userId
eg.
2018-01-02 12:00:00;OKK;PAG1;1234555
2018-01-02 12:01:01;NEX;PAG1;1234555
2018-01-02 12:00:02;OKK;PAG1;5556667
2018-01-02 12:01:03;NEX;PAG1;5556667
where OKK stands for the OK button on Page 1, NEX for the Next button on Page 2, …
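For Markov-chain-style analysis, the log above first has to become an ordered click sequence per user. A minimal stdlib sketch over the sample lines (field layout inferred from the examples: timestamp, button, page, user ID):

```python
from collections import defaultdict

log = """2018-01-02 12:00:00;OKK;PAG1;1234555
2018-01-02 12:01:01;NEX;PAG1;1234555
2018-01-02 12:00:02;OKK;PAG1;5556667
2018-01-02 12:01:03;NEX;PAG1;5556667"""

# Group clicks per user; ISO-format timestamps sort correctly as strings.
clicks = defaultdict(list)
for line in log.splitlines():
    ts, button, page, user = line.split(";")
    clicks[user].append((ts, button))

paths = {user: [b for _, b in sorted(events)] for user, events in clicks.items()}
print(paths)   # {'1234555': ['OKK', 'NEX'], '5556667': ['OKK', 'NEX']}
```

Counting consecutive pairs in each path then yields the transition counts a Markov-chain model needs; in Spark the grouping step is a `groupBy` on the user column.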
Thank you to you both, Jorge and Mich.
You've answered my questions in a quasi-realtime manner!
I will look into Flume and HDFS.
> On 22 Feb 2016 at 22:41, Jorge Machado wrote:
>
> To get the data, you could use Flume to ship the logs from the servers to
> HDFS for
Hello,
I have a few newbie questions regarding Spark.
Is Spark a good tool to process Web logs for attacks (or is it better to use a
more specialized tool)? If so, are there any plugins for this purpose?
Can you use Spark to weed out huge logs and extract only suspicious activities;
e.g., 1000
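One common notion of "suspicious activity" is request rate per client. A minimal stdlib sketch of that filter; the log shape, IPs, and threshold are all assumptions for illustration:

```python
from collections import Counter

# Hypothetical parsed access-log events: (client_ip, request_path).
events = [("10.0.0.1", "/login")] * 500 + [("10.0.0.2", "/index"),
                                           ("10.0.0.3", "/login")]

THRESHOLD = 100   # assumed cutoff; tune to the site's normal traffic
counts = Counter(ip for ip, _ in events)
suspicious = sorted(ip for ip, n in counts.items() if n > THRESHOLD)
print(suspicious)   # ['10.0.0.1']
```

In Spark the equivalent shape is `df.groupBy("ip").count().filter("count > 100")`, which scales the same counting idea to logs that do not fit on one machine.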
Hello,
I need to develop an application which:
- reads XML files in thousands of directories, two levels down, from year x to
year y
- extracts data from image tags in those files and stores them in a SQL or
NoSQL database
- generates ImageMagick commands based on the extracted data to
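The first two requirements can be sketched with the standard library alone. The `<image>` tag name and its `src`/`width` attributes are assumptions about the schema; for thousands of directories, Spark's `wholeTextFiles` reader could distribute the same loop:

```python
import os
import pathlib
import tempfile
import xml.etree.ElementTree as ET

def extract_images(root_dir):
    """Collect (file, src, width) for every <image> element in .xml files
    anywhere under root_dir (os.walk descends all levels)."""
    out = []
    for dirpath, _dirs, filenames in os.walk(root_dir):
        for name in filenames:
            if name.endswith(".xml"):
                path = os.path.join(dirpath, name)
                for img in ET.parse(path).iter("image"):
                    out.append((path, img.get("src"), img.get("width")))
    return out

# Demo on a throwaway year/sub-directory tree:
root = tempfile.mkdtemp()
sub = pathlib.Path(root, "2001", "batch1")
sub.mkdir(parents=True)
(sub / "doc.xml").write_text('<doc><image src="a.png" width="640"/></doc>')
for path, src, width in extract_images(root):
    print(src, width)   # a.png 640
```

Each extracted record could then be mapped to an ImageMagick invocation (e.g. a string of the hypothetical form `convert a.png -resize 640 out.png`) for the last requirement.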