Toolset for Collecting Shell Commands and Its Application in Hands-on 
Cybersecurity Training

By Valdemar Švábenský, Jan Vykopal, Daniel Tovarnák & Pavel Celeda. Masaryk
University, Brno, Czech Republic

https://arxiv.org/pdf/2112.11118.pdf

Abstract:

This Full Paper in the Innovative Practice category presents and evaluates a 
technical innovation for hands-on classes.

When learning cybersecurity, operating systems, or networking, students perform 
practical tasks using a broad range
of command-line tools.

Collecting and analyzing data about command usage can reveal valuable
insights into how students progress and where they make mistakes. However,
few learning environments support recording and inspecting command-line
inputs, and setting up an efficient infrastructure for this purpose is
challenging.

To aid engineering and computing educators, we share the design and 
implementation of an open-source toolset for logging commands that students 
execute on Linux machines. Compared to basic solutions, such as shell history 
files, the toolset’s novelty and added value are threefold.

First, its configuration is automated so that it can be easily used in
classes on different topics. Second, it collects metadata about the command
execution, such as a timestamp, hostname, and IP address. Third, all data
are instantly forwarded to central storage in a unified, semi-structured
format. This enables automated processing of the data, both in real time and
post hoc, to enhance the instructors’ understanding of student actions.

The toolset works independently of the teaching content, the training
network’s topology, and the number of students working in parallel. We
demonstrated the toolset’s value in two learning environments at four
training sessions. Over two semesters, 50 students played educational
cybersecurity games using a Linux command-line interface. Each training
session lasted approximately two hours, during which we recorded 4439 shell
commands. The semi-automated data analysis revealed the students’ solution
patterns, the tools they used, and their misconceptions.

Our insights from creating the toolset and applying it in teaching practice are 
relevant for instructors, researchers, and developers of learning environments. 
We provide the software and data resulting from this work so that others can 
use them in their hands-on classes.

I. INTRODUCTION

Hands-on training is vital for gaining expertise in computing disciplines.
Topics such as cybersecurity, operating systems, and networking must be
practiced in a computer environment so that students can try various tools
and techniques. Such a learning environment is called a sandbox. It contains
networked hosts that may be intentionally vulnerable to allow practicing
cyber attacks and defense. These skills are grounded in the current
cybersecurity curricular guidelines [1], which aim to address the growing
cybersecurity workforce shortage [2].

For the training, each student receives an isolated sandbox hosted locally
or in a cloud. To solve the training tasks, students work with many tools,
both in a graphical user interface (GUI) and a command-line interface (CLI).
This paper focuses on the Linux CLI, which is common both in higher
education in computing and in software development in industry practice.

Analyzing CLI interactions opens opportunities for educational research and
classroom innovation. In traditional face-to-face classes, instructors must
look at the students’ computer screens to observe the learning process.
However, this approach does not scale to large classes, and it becomes
difficult in distance education.

Instead, if the students’ executed commands are logged, instructors and 
researchers may leverage them to support learning. By employing the methods of 
educational data mining [3] and learning analytics [4], the CLI data can help 
achieve important educational goals, such as to:

• better understand students’ approaches to learning, both in face-to-face and 
remote classes,
• objectively assess learning, and
• provide targeted instruction and feedback.

This paper examines the following research question relevant to
instructors: What can we infer from students’ command histories that is
indicative of their learning processes? Specifically, our goal is to
understand how students solve cybersecurity assignments by analyzing their
CLI usage. To address this question, we propose a generic method for
collecting CLI logs from hands-on training. Then, we evaluate this method by
gathering the logs of 50 students at four training sessions and
investigating three sub-questions, each corresponding to a use case of the
data:

1) What does the command distribution indicate about the students’ approach
to solving the tasks? Our motivation is to analyze which tools are commonly
used and how effective they are with respect to the training tasks.

2) Which commands are used immediately after the student accesses the
learning environment? We can observe whether the students started solving
the initial task, familiarized themselves with the environment, or displayed
off-task behavior. This allows for providing suitable scaffolding.

3) How much time do students spend on the tasks, and how often do they
attempt an action? Observing the time differences between successive
commands can indicate the students’ skill level and support assessment. (A
sketch of how such analyses can be automated follows this list.)
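
To make these use cases concrete, here is a minimal analysis sketch of our
own; it is not the paper’s implementation. It assumes the collected logs are
stored as one JSON record per line in a hypothetical file commands.jsonl,
with timestamp, username, and cmd fields.

    # Hypothetical post-hoc analysis of collected command logs.
    # Assumes one JSON record per line with "timestamp", "username", "cmd".
    import json
    from collections import Counter, defaultdict
    from datetime import datetime

    per_student = defaultdict(list)
    with open("commands.jsonl") as f:  # assumed log file name
        for line in f:
            r = json.loads(line)
            r["timestamp"] = datetime.fromisoformat(r["timestamp"])
            per_student[r["username"]].append(r)
    for records in per_student.values():
        records.sort(key=lambda r: r["timestamp"])

    # 1) Command distribution: which tools are used most often.
    tools = Counter(r["cmd"].split()[0]
                    for records in per_student.values()
                    for r in records if r["cmd"].strip())
    print(tools.most_common(10))

    # 2) The first command each student ran in the environment.
    for user, records in per_student.items():
        print(user, "started with:", records[0]["cmd"])

    # 3) Median time gap between successive commands, per student.
    for user, records in per_student.items():
        gaps = sorted((b["timestamp"] - a["timestamp"]).total_seconds()
                      for a, b in zip(records, records[1:]))
        if gaps:
            print(user, "median gap (s):", gaps[len(gaps) // 2])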

Although there are many cybersecurity learning environments, which we review in 
Section II, their logging support
is often limited or non-existent. The current solutions do not allow 
instructors to uniformly collect CLI data and metadata with minimal setup and 
then correlate the logs from multiple sandboxes for advanced analyses.

This paper addresses this gap by presenting and evaluating a technical 
innovation for hands-on classes that employ CLI tools. We created a toolset 
that collects Linux shell commands in physical or virtual learning environments 
and stores them in a unified format. Compared to the previous practice, where 
observing students’ learning was difficult or even impossible, the proposed 
innovation enables understanding student approaches at scale. It also allows 
employing educational data mining and learning analytics techniques to gain 
further insights.
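
As a rough sketch of the collection idea (our illustration under stated
assumptions, not the authors’ actual implementation; see the published
toolset for that), a shell hook could pass each executed command to a small
forwarder that builds a semi-structured record with the metadata mentioned
above and ships it to central storage, for example over syslog. The script
name, field names, and server address below are hypothetical.

    #!/usr/bin/env python3
    # forward_cmd.py -- hypothetical forwarder that a shell hook might call,
    # e.g. in bash: PROMPT_COMMAND='forward_cmd.py "$(history 1)"'
    import json
    import logging
    import logging.handlers
    import socket
    import sys
    from datetime import datetime, timezone

    LOG_SERVER = ("logs.example.org", 514)  # assumed central syslog endpoint

    logger = logging.getLogger("cmdlog")
    logger.setLevel(logging.INFO)
    logger.addHandler(logging.handlers.SysLogHandler(address=LOG_SERVER))

    # One semi-structured record per executed command.
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "hostname": socket.gethostname(),
        "ip": socket.gethostbyname(socket.gethostname()),  # best-effort IP
        "cmd": " ".join(sys.argv[1:]),  # command text passed in by the hook
    }
    logger.info(json.dumps(record))

Emitting one JSON line per command keeps the data uniform across sandboxes,
so records from many students can be correlated in central storage.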

The toolset design is explained in Section III. In Section IV, we introduce
a study that deploys the toolset in practice and evaluates it in authentic
educational contexts. Section V presents the results of the study and
addresses the questions above. Section VI discusses the study and proposes
multiple research ideas that further leverage the collected data. Finally,
Section VII summarizes our contributions. We also publish the toolset as
open-source software. Instructors, researchers, and developers can use it to
enhance computing classes in areas such as cybersecurity, operating systems,
and networking. (snip)

