Re: Applications always showing in pending state even after cluster restart

Gaurav Chhabra Fri, 03 Jul 2020 01:15:14 -0700

My sincere apologies for the late reply. You're right, Jonathan. That's
also a fine approach.


Anyway, I got tied up with other activities, so I couldn't check whether the
issue still persisted. Though I had created proper security group rules and
my cluster was in a stopped state, I hadn't actually deleted the pending
application queue when I last stopped the cluster. Yesterday, when I
started the cluster, I saw that around 720 applications were queued up. I
killed those and didn't notice any further applications getting queued.
Today, when I started the cluster again, the queue was empty. So I can
confirm that it was indeed an attack on my cluster earlier.
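
For anyone hitting the same symptom, one way to check who is filling the queue is to group the pending applications by submitting user. This is only a rough sketch: it assumes `yarn application -list` prints tab-separated, space-padded columns with the user in the fourth column, and the function name `count_apps_by_user` is made up for illustration.

```shell
# Sketch only: assumes tab-separated `yarn application -list` output with the
# submitting user in column 4. count_apps_by_user is an illustrative name.
count_apps_by_user() {
  awk -F '\t' '
    $1 ~ /application_/ {
      user = $4
      gsub(/^[ \t]+|[ \t]+$/, "", user)   # strip the column padding
      count[user]++
    }
    END { for (u in count) print count[u], u }
  ' | sort -rn
}

# Example (needs a running cluster):
#   yarn application -list -appStates ACCEPTED | count_apps_by_user
```

A large count under a user you don't recognise suggests outside submissions. In stock Hadoop configurations, unauthenticated web requests are typically attributed to the default static user `dr.who`, so that name is worth looking for.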

Regards.

On Sat, 13 Jun 2020 at 17:04, Jonathan Aquilina <jaquil...@eagleeyet.net>
wrote:

> What you are describing is a fairly easy fix.
>
>
>
> On the Azure network security group, lock down those public IP addresses so
> that they are accessible only from your IP address or from the IP addresses
> that are meant to have access.
>
>
>
> Regards,
>
> Jonathan Aquilina
>
> EagleEyeT
>
>
>
> Phone: +356 2033 0099
>
> Mobile: +356 7995 7942
>
> Email: sa...@eagleeyet.net
>
> Website: https://eagleeyet.net
>
>
>
> *From:* Gaurav Chhabra <varuag.chha...@gmail.com>
> *Sent:* 13 June 2020 11:45
> *To:* Hariharan <hariharan...@gmail.com>
> *Cc:* common-u...@hadoop.apache.org <user@hadoop.apache.org>
> *Subject:* Re: Applications always showing in pending state even after
> cluster restart
>
>
>
> Wow! What a guess, Hari! :) I wasn't sure those pending tasks could have
> been related to an attack. This happened to me from 1st to 5th June '20. I
> didn't check my Azure usage during that time, though I had been keeping tabs
> on it almost every day in May. On 8th June (Mon), when I checked the charges,
> the Azure 'data transfer out' charges were showing $88, $90 & $110 for
> bigdataserver-{5,6,7} respectively. I was shocked, as last month's charge
> was around $53. I opened a ticket with Azure, and then we started the
> cluster again (with an Azure networking engineer alongside me). Within 3-4
> minutes, data transfer out was again around 10-12 GB in total (from 3
> instances). We could only figure out that the hits were going to some blob
> storage in Azure. He said it was most likely a virus or some attack.
>
>
>
> I have now removed public IPs from all instances except two (one where
> Cloudera Manager is hosted and another where the Resource Manager is
> running). Even those two exposed ones accept incoming requests only from
> my laptop's IP. Things are fine now.
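
The restriction described above could be expressed with the Azure CLI roughly as follows. This is a sketch under assumptions, not the exact rules used on this cluster: the resource group, NSG name, rule name, priority, and IP address are placeholders, and the ports are the stock defaults for the ResourceManager web UI (8088) and Cloudera Manager (7180), which may differ on a given installation. The `AZ` variable is an illustrative hook that lets the command be printed instead of executed.

```shell
# Sketch only: every name and value passed in is a placeholder.
# Allow inbound TCP to the given ports from a single source IP.
# Set AZ=echo to print the command instead of calling the Azure CLI.
allow_from_ip() {
  resource_group=$1
  nsg_name=$2
  source_ip=$3
  ports=$4   # space-separated list, e.g. "8088 7180"

  # $ports is left unquoted on purpose so each port becomes its own argument.
  ${AZ:-az} network nsg rule create \
    --resource-group "$resource_group" \
    --nsg-name "$nsg_name" \
    --name allow-admin-from-my-ip \
    --priority 100 \
    --direction Inbound \
    --access Allow \
    --protocol Tcp \
    --source-address-prefixes "$source_ip/32" \
    --destination-port-ranges $ports
}

# Example with placeholder values (203.0.113.10 is a documentation address):
#   allow_from_ip my-resource-group my-cluster-nsg 203.0.113.10 "8088 7180"
```

An allow rule like this only narrows access if any broader allow rules for the same ports are removed. The NSG's default rules already deny other inbound Internet traffic.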
>
>
>
> One thing that I don't get is how the attacker 'personally' benefits from
> this, apart from obviously raising my monthly bill.
>
>
>
>
>
> Regards
>
>
>
>
>
>
>
> On Sat, 13 Jun 2020 at 11:00, Hariharan <hariharan...@gmail.com> wrote:
>
> This is most likely an attempt to attack your system. If you are running
> your cluster in the cloud, you should run it in a private network so it is
> not exposed to the Internet. Alternatively you can secure your installation
> as described here -
> https://blog.cloudera.com/how-to-secure-internet-exposed-apache-hadoop/
>
>
>
> Thanks,
>
> Hari
>
>
>
> On Fri, 12 Jun 2020, 12:20 Gaurav Chhabra, <varuag.chha...@gmail.com>
> wrote:
>
> Hi All,
>
>
>
>
> I have started learning Hadoop and its related components. I am following
> a tutorial on Hadoop Administration on Udemy. As part of the learning
> process, I ran the following command:
>
> $ hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
> randomtextwriter \
> -Ddfs.replication=1 /user/bigdata/randomtextwriter
>
> The above command created 30 files of 1 GB each. Then I ran the wordcount
> job below:
>
> $ yarn jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
> wordcount \
> -Dmapreduce.input.fileinputformat.split.minsize=268435456 \
> -Dmapreduce.job.reduces=8 \
> /user/bigdata/randomtext \
> /user/bigdata/wordcount
>
> After executing the above command, I decided to kill the application after
> some time, so I first ran 'yarn application -list', which listed a great
> many applications, one of which was *wordcount*. I killed that particular
> application using 'yarn application -kill <application-id>'. However, when I
> checked the scheduler, I could see that several applications were still
> showing in Pending state, so I ran the following command:
>
>
> $ for x in $(yarn application -list -appStates ACCEPTED | awk 'NR > 2 {
> print $1 }'); do yarn application -kill $x; done
>
> The loop was killing the applications, as I could see the 'Apps Completed'
> count going up, but as soon as all the apps had been killed, I saw
> applications being created again. Even if I stop the whole cluster and
> start it again, the scheduler shows that there are submitted applications in
> Pending state.
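
A slightly more defensive variant of the kill loop above, offered as a sketch rather than a fix: it matches IDs by their `application_` prefix instead of skipping a fixed number of header lines, and it quotes each ID. The function names `accepted_app_ids` and `kill_apps` are made up for illustration.

```shell
# Sketch only: helper names are illustrative assumptions.

# Print the application IDs found in `yarn application -list` output on stdin.
# Matching on the ID pattern avoids depending on how many header lines appear.
accepted_app_ids() {
  awk '$1 ~ /^application_[0-9]+_[0-9]+$/ { print $1 }'
}

# Kill each application whose ID arrives on stdin, one per line.
kill_apps() {
  while read -r app_id; do
    yarn application -kill "$app_id"
  done
}

# Example (needs a running cluster):
#   yarn application -list -appStates ACCEPTED | accepted_app_ids | kill_apps
```

As the behaviour described above shows, killing applications does not help for long while something keeps resubmitting them, so this is only useful once the source of the submissions has been blocked.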
>
>
>
> Here's the content of fair-scheduler.xml:
>
> <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
> <allocations>
>     <queue name="root">
>         <schedulingPolicy>drf</schedulingPolicy>
>         <queue name="default">
>             <schedulingPolicy>drf</schedulingPolicy>
>         </queue>
>     </queue>
>     <queuePlacementPolicy>
>         <rule name="specified" create="false"/>
>         <rule name="default" create="true"/>
>     </queuePlacementPolicy>
> </allocations>
>
> This is just a test cluster. I just want to kill the applications and clear
> the application queue. Any help would really be appreciated, as I have been
> struggling with this for the last few days.
>
>
>
>
>
> Regards
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: user-h...@hadoop.apache.org
>
>
